JSON to Insights: Tabulating Non-tabular Data with Precog
By Jeff Carr
December 9, 2019

Tabulating non-tabular data with Precog

With Precog, we connect directly to the source of the data, in this case the FDA.gov Web API. This ensures that we are always working with the latest data, including new records and corrections. There is no need to download, stage or import the data. Self-service users can add their own sources, and commonly used sources can be added in advance so that self-service users don’t need to add them themselves.

Like many web data sources, FDA.gov provides data in a paginated manner. This means that the dataset is not available as a single document; rather, it is split across multiple documents (pages) that must be requested one after another. Precog handles this pagination for us.

We’ll use the following dataset in our analysis:

Adverse events relating to NSAID drugs indicated for Osteoarthritis:

https://api.fda.gov/drug/event.json?limit=100&search=patient.drug.drugindication.exact:OSTEOARTHRITIS

To retrieve all the pages, we’ll need to get an API key from FDA.gov, provide it as the basic authentication username, and select “Paginated via header” as the request type. Precog also supports offset-based and token-based pagination.
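For readers curious what header-based pagination involves, here is a minimal Python sketch. It assumes the server advertises the next page in a `Link` response header with `rel="next"`, which is the common convention for this style of paging. It is not Precog's implementation. The names `next_page_url`, `iter_results` and the `fetch` callable are hypothetical, introduced only for illustration, and the `results` key reflects the shape of openFDA responses.

```python
import re
from typing import Callable, Dict, Iterator, Optional, Tuple

# A fetch function takes a URL and returns (parsed JSON body, response headers).
# It is injected so the paging logic can be exercised without network access.
Fetch = Callable[[str], Tuple[dict, Dict[str, str]]]

# Matches entries such as: <https://example.org/page2>; rel="next"
_NEXT_LINK = re.compile(r'<([^>]+)>\s*;\s*rel="?next"?')


def next_page_url(headers: Dict[str, str]) -> Optional[str]:
    """Return the URL of the next page from a Link header, or None if absent."""
    link = headers.get("Link") or headers.get("link") or ""
    match = _NEXT_LINK.search(link)
    return match.group(1) if match else None


def iter_results(first_url: str, fetch: Fetch) -> Iterator[dict]:
    """Yield every record from every page, following rel="next" links."""
    url: Optional[str] = first_url
    while url is not None:
        body, headers = fetch(url)
        # Assumption: each page stores its records under a "results" key.
        yield from body.get("results", [])
        url = next_page_url(headers)
```

In practice, `fetch` would wrap an HTTP client and pass the API key as the basic authentication username with an empty password, as described above.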

Screenshot of adding the datasource in Precog

Precog instantly shows us the available data and allows us to browse it. To tabulate the data we simply pick what we are interested in and it appears as columns in the table below. Here we are interested in the id of the reports, as well as reactions and information about drugs.
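To make "tabulating" concrete, the following Python sketch flattens one nested adverse event report into flat rows, one per combination of reaction and drug. This is only one possible layout and an assumption on our part; the table Precog produces may be arranged differently. The field names `safetyreportid`, `patient.reaction[].reactionmeddrapt` and `patient.drug[].medicinalproduct` come from the openFDA drug event schema, while the function name `tabulate_report` and the sample record are invented for illustration.

```python
from itertools import product
from typing import Dict, Iterator, Optional


def tabulate_report(report: dict) -> Iterator[Dict[str, Optional[str]]]:
    """Flatten one nested report into rows: one per (reaction, drug) pair."""
    patient = report.get("patient", {})
    # Fall back to [None] so a report with no reactions or no drugs
    # still produces at least one row.
    reactions = [r.get("reactionmeddrapt") for r in patient.get("reaction", [])] or [None]
    drugs = [d.get("medicinalproduct") for d in patient.get("drug", [])] or [None]
    for reaction, drug in product(reactions, drugs):
        yield {
            "safetyreportid": report.get("safetyreportid"),
            "reaction": reaction,
            "drug": drug,
        }


# Invented sample record, shaped like an openFDA drug event result.
sample = {
    "safetyreportid": "X1",
    "patient": {
        "reaction": [{"reactionmeddrapt": "NAUSEA"}, {"reactionmeddrapt": "RASH"}],
        "drug": [{"medicinalproduct": "DRUG A"}],
    },
}
rows = list(tabulate_report(sample))
```

Here the single nested report becomes two rows, because it lists two reactions and one drug. Repeating the report id on each row is what lets tools that expect tables group and join on it.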

Screenshot of browsing and picking data in Precog

Now that we’ve picked out our table we can analyse the data directly, using software such as Power BI, DataRobot, Alteryx and ThoughtSpot, or push it into warehouses and databases such as Snowflake and Postgres. Every time we access the results Precog provides the latest data from the source.

Screenshot of some table access options in Precog. These include S3, Snowflake, ThoughtSpot, PostgreSQL, DataRobot, Power BI, Alteryx and more.

We can think of using Precog like online shopping, except instead of a cart we have a table. We can browse, pick what we want and then check out. Now that we’ve “checked out” we can use software such as Power BI, ThoughtSpot, Looker, Tableau or SQL to analyse and manipulate the data.

To summarize, we have:

  • Added the FDA Web API as a data source, letting Precog automatically deal with the paginated data
  • Browsed the available data fields from the non-tabular data
  • Picked the data we are interested in, letting Precog automatically tabulate the data for us
  • Loaded the tabulated data into Snowflake or Power BI

By doing this we have now successfully paged, downloaded, imported, explored and tabulated the non-tabular data.

