Data Science on JSON Using Precog and RStudio or Jupyter Notebooks
By Precog Data
March 18, 2019

Obstacles in doing/learning Data Science

Whether you’re a professional data scientist or studying to become a data scientist, you’ll likely need to work with a JSON dataset. JSON isn’t easy to work with. It’s not tabular, and you can’t just push it to a SQL database — at least not without a “bit” of work.

Tabularizing JSON data usually requires writing Python scripts. But given multiple JSON datasets, there’s a high chance that each one is structured very differently, so your Python code is unlikely to be reusable.
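To make that concrete, here is a minimal sketch of the kind of one-off flattening script this usually means. The record layout and the `flatten` helper are made up for illustration; they are not taken from any real dataset or library.

```python
import json


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened keys.
            flat.update(flatten(value, full_key, sep))
        else:
            # Scalars (and lists, which this sketch does not unpack) are kept as-is.
            flat[full_key] = value
    return flat


# Hypothetical record, invented for this example.
raw = (
    '{"game_id": 1, "teams": {"home": {"name": "A", "score": 101},'
    ' "away": {"name": "B", "score": 99}}}'
)
row = flatten(json.loads(raw))
print(row)
```

Even this small helper bakes in assumptions about the data: it ignores arrays of nested objects entirely, and a dataset shaped differently would need different handling. That is why scripts like this rarely carry over from one JSON source to the next.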

As a professional data scientist, tabularizing your JSON is time you have to factor into your report delivery. You may have the option to use your company’s resources to deliver a tabular form of your complicated JSON. Either way, somewhere along your workflow, someone is spending time and money writing throwaway Python code to tabularize JSON.

If you are learning data science, then you’ve likely bumped into blogs such as this one. The article spends approximately 80% of its content showing how to explore JSON data using Linux commands and Python code. It isn’t until the last few paragraphs that the article dives into answering questions about actually doing something with the data.

What if you didn’t have to spend your time writing one-off Python scripts to tabularize your JSON data? You might deliver insight more quickly, or more frequently, or maybe your company would save money on ETL resources.

Imagine if data science students didn’t have to spend time converting JSON data into a table. Instead, they could spend time learning to ask and answer questions about their data.

Imagine if students of Berkeley’s Data 8 course spent zero time tabularizing complicated JSON datasets. Rather, they could spend time applying their skills in a class project involving complicated JSON datasets not previously considered due to the overhead of tabularizing.

In this short blog I’ll show you how to quickly tabularize a 700MB NBA JSON dataset I found online, nbagames.json, using Precog.

I’m an engineer at Precog and I’m very proud of the engineering feat my colleagues have achieved. I’m here to show you how you can benefit from our creation, Precog.

Access your JSON data with ease

Precog can read your JSON data from multiple sources. In this tutorial we’ll be reading a JSON file from an S3 bucket. Download nbagames.json and upload it to an S3 bucket. There are plenty of instructions on how to do this via a quick Google search.

Precog can also readily read your data from an API, a local file, Azure, and other sources. Check out our instructions for these scenarios.

Tabularize your JSON data using Precog

In a 2-minute video, I’m going to show you how to create a table from the NBA JSON data I mentioned above. First, I will connect to my S3 bucket containing the nbagames.json data. Then, I will point Precog to the nbagames.json file and create some columns.

And that’s it! I have just created a table from the 700MB JSON file without needing to explore my data using Linux commands, and especially without having to write any Python.

Importing the Precog table

Using RStudio

The short video below demonstrates how quickly one can import the table created from the NBA JSON data in Precog using RStudio. Like many of the other tools in this realm, RStudio can read data from a URL, display the columns we selected, and produce a simple bar plot.

Using Jupyter Notebook

Of course, if you like doing data science using Python, you certainly can! My point is that with Precog there’s no need to write Python to tabularize your JSON. Below, I’ve included an example on how to get started with a Jupyter Notebook, pandas, and Python.
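As a rough sketch of what such a notebook cell might look like: the code below assumes the Precog table is reachable as CSV, and it uses an in-memory stand-in so it runs without a Precog instance. The `load_table` helper, the column names `team` and `points`, and the sample rows are all hypothetical.

```python
import io

import pandas as pd


def load_table(source):
    """Read a CSV table from a URL, a file path, or a file-like object."""
    return pd.read_csv(source)


# Stand-in for the exported table; in a real notebook you would pass the
# URL of your own Precog table instead (an assumption about your setup).
sample_csv = io.StringIO("team,points\nHawks,101\nBulls,99\nHawks,110\n")

games = load_table(sample_csv)

# With tabular data in hand, a question becomes a one-liner:
# total points per team across all games in the table.
totals = games.groupby("team")["points"].sum()
print(totals)
```

Because `pd.read_csv` accepts a URL string directly, swapping the stand-in for a real table address should be the only change needed, provided the table really is served as CSV.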


Yes! It’s actually this easy to get started analyzing JSON data. Just connect Precog to your data source and select your columns, then import into your favorite data science tool.

Precog has other amazing uses. Would you like to stream data into Amazon Redshift, tabularize data stuck in MongoDB, or front your data API with Precog? Let us know; we are happy to help.

To get started with Precog, get in touch with our sales team for more information.

Precog is also available on AWS Marketplace.

Get started analyzing your complex JSON and skip the Python scripts with Precog!

