What To Do When You Don’t Have A Data Integration Engineer
By Mike Corbisiero
January 28, 2019

You’ve probably been hearing about “big data” for a while now. The term has been around for years, and in the meantime, data has only gotten bigger. A whole industry has sprung up around it, from collection to storage to analytics, and data integration engineers are part of that puzzle. But what if your company doesn’t have one?

Big Data Is Getting Bigger

Companies have been tracking data for a long time. What they sell and when, customer demographics, busiest times of day and year, website traffic and sources, social media engagements, and so on are all data that can be turned into analytics-ready tables and used to draw conclusions and make predictions about your users.

But recently, the world of consumer data has gotten a lot more complicated. SaaS companies have exploded in popularity and brought with them huge amounts of complicated usage data. Dropbox, for example, isn’t just tracking how many times users log in. It’s tracking what kinds of files they upload, when they do it, how long it takes, which ISPs they use, how many folders they use, and dozens of other data points. Something as simple as saving a Word document can generate a huge amount of data.

Mobile app usage has also taken off, bringing with it its own set of complicated data. App makers want to know who’s using their app, where and when, what they’re using it for, how long they’re in the app, which features they use the most, and so on. Add in a social aspect, like with multiplayer gaming, and there’s a whole other set of data about how users interact to accompany it.

Finally, the Internet of Things has officially arrived. The concept has been around for almost 20 years, but it took a while to be realized. The number of IoT devices connected to the internet (not phones and computers, but smart appliances, voice assistants, Wi-Fi-enabled plugs, lights, switches, cameras, and the like) has increased more than tenfold in the last ten years. Some projections say that there will be 50 billion connected devices by 2020.

And every single one of them is generating data on a second-to-second basis. A smart bulb is registering every time it’s turned on and off, which color the user selects, whether they used their phone, smart watch, or voice assistant to do it, which room it’s assigned to, and more.

The Complex Data Problem

The result of this explosion of data is that companies have never had more data to work with. But how are they supposed to do anything with it? Most of this data arrives as JSON, a JavaScript-based format that’s incredibly flexible and therefore popular with developers, but also complex and difficult to parse. Its flexibility means developers can collect and store data in far more complicated models than a traditional RDBMS would support, and can quickly change and update schemas as needed to support rapidly changing applications. But all this flexibility comes at a price: performing analytics on the data is incredibly complex.
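To make the parsing problem concrete, here is a minimal sketch of what turning one nested JSON record into a single analytics-ready row can look like. The smart-bulb event and all of its field names are hypothetical, invented for illustration, and the `flatten` helper is just one simple approach, not a description of how any particular tool works.

```python
import json

# Hypothetical event from a smart bulb; the field names are illustrative only.
raw = """
{
  "device": {"id": "bulb-17", "room": "kitchen"},
  "event": {"type": "power", "state": "on", "source": "voice_assistant"},
  "color": {"hue": 42, "brightness": 0.8}
}
"""


def flatten(obj, prefix=""):
    """Recursively turn nested objects into one flat dict with dotted keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into nested objects, carrying the parent key as a prefix.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


row = flatten(json.loads(raw))
print(row)
```

Even this small helper only handles nested objects. Arrays, missing fields, and fields whose type changes between records each need additional handling, which is where the complexity starts to add up.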

The Role Of A Data Integration Engineer

This is where a data integration engineer comes in. There are lots of tools to perform traditional ETL (extract, transform, and load) or ELT tasks, but they all fall short when it comes to complex JSON data.

Usually, an engineer is required to bridge that gap. The engineer will manually write code, often in Python, to deal with the complex and constantly shifting nature of the JSON data. This is a slow, tedious process. It’s also costly: engineers are highly trained in a very technical skill, and most small companies can’t afford to hire them full-time.
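As a rough illustration of why this hand-written code is tedious, consider what happens when an application update changes the shape of an event. The two event versions below are hypothetical, as are the field names and the `normalize` function. The point is only that every such change needs its own special case.

```python
# Two hypothetical versions of the same event; the second shape is assumed
# to have appeared after an application update.
old_event = {"user": "u1", "duration": 12}
new_event = {"user": {"id": "u2", "plan": "pro"}, "duration_ms": 3400}


def normalize(event):
    """Map either event shape onto one fixed set of columns."""
    user = event.get("user")
    # 'user' used to be a plain string and later became an object.
    user_id = user["id"] if isinstance(user, dict) else user
    # 'duration' (seconds) was replaced by 'duration_ms' (milliseconds).
    if "duration_ms" in event:
        seconds = event["duration_ms"] / 1000
    else:
        seconds = float(event["duration"])
    return {"user_id": user_id, "duration_s": seconds}


rows = [normalize(e) for e in (old_event, new_event)]
print(rows)
```

Each new schema change means another branch like these, written and tested by hand.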

Even big companies that can afford a whole floor full of data integration engineers run into another problem: custom code isn’t flexible. If marketers or CIOs request a certain set of analytics-ready data over a certain period of time, it can take weeks or months to extract that data from the raw JSON. If they have follow-up requests or want to see a different time span, the clock starts again. So what’s a small business supposed to do?

Option 1: Contract Out The Work

There are a lot of companies that can be hired on a contract basis to parse data, and a lot of them claim to be able to handle JSON data. When pressed, though, they all admit that they’ll need to write custom code to handle the sort of complex JSON data that’s common in today’s world. You can hire these companies to parse your data for much less than it would cost to hire your own engineer, but you’ll still run into the same issues of time and inflexibility.

Option 2: Precog

For decades, all data parsing tools have been built on a type of mathematics called relational algebra, but relational algebra falls short when it comes to the complex nested data points of JSON. Precog created a novel and powerful underlying algebra called multi-dimensional relational algebra, which allows us to convert any data, regardless of complexity, into an analytics-ready tabular format in seconds.
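To see why nested JSON and flat relational tables don’t line up directly, consider a record that contains a list. The sketch below is plain Python with a hypothetical order record and a hypothetical `unnest` helper. It illustrates the nested-to-tabular problem only, and is not a description of Precog’s multi-dimensional relational algebra, whose details this article does not cover.

```python
# Hypothetical order record with a nested list of items.
order = {
    "order_id": 1001,
    "customer": {"name": "Ada"},
    "items": [
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
}


def unnest(record, list_key):
    """Emit one flat row per element of record[list_key], repeating parent fields."""
    parent = {
        "order_id": record["order_id"],
        "customer_name": record["customer"]["name"],
    }
    # One input record becomes several output rows, one per list element.
    return [{**parent, **item} for item in record.get(list_key, [])]


table = unnest(order, "items")
for row in table:
    print(row)
```

A single record fans out into several rows, and the helper has to know in advance which key holds the list and which parent fields to repeat. With several lists, or lists nested inside lists, that hand-coded knowledge grows quickly.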

More importantly, Precog’s interface is no more complicated than a filesystem. Anyone, from a data engineer to a C-suite exec to a marketer or analyst, can extract the data they need and run the analytics they need, complete with follow-up queries. Precog is the only tool in the world built on this framework. In short, the days of waiting on a data integration engineer to deliver curated data are over.




