This session outlines Creative Commons' (CC) current efforts to identify and catalogue 1.3 billion works that have been published under a CC license. This endeavour poses several challenges, including: 1) how do we identify these works across open web platforms, 2) what is the most efficient way to process and store the data, and 3) how do we make it accessible to the community? We will outline the team's current strategy, discuss the above challenges, share best practices for analyzing data from open web platforms, and highlight preliminary results.