More details about Precog

Self-service and pre-preparation

Traditionally, data is prepared in advance into a normalised schema in a warehouse before analysts can use it. Precog reduces both the compute and the labour involved in this process, while also improving accuracy.

There is an alternative approach. Providing analysts with direct access to Precog enables them to fulfil their own data requests by picking the exact data they need for their analysis. This enables analysts to find the needle in the haystack, improves performance and eliminates the time and cost involved in preparing, updating and storing unneeded data.

Both approaches have benefits and you should choose the one that is right for your team. Precog will provide improved performance and ease of use, whether you choose to prepare the data in advance for your analysts or whether you give your analysts the self-service power to fulfil their own data requests.

Single tables and normalised schemas

Each report contains a list of drugs and a list of reactions. We picked data about both reactions and drugs into the same table, so Precog provides a table in which each report contributes one row per (reaction, drug) pair: the number of reactions multiplied by the number of drugs.
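The cross product above can be sketched in a few lines of Python. The report structure and field names here are illustrative assumptions, not Precog's actual schema:

```python
from itertools import product

# A toy adverse-event report with two drugs and two reactions
# (field names are illustrative, not Precog's actual schema).
report = {
    "report_id": "R-1001",
    "drugs": ["aspirin", "ibuprofen"],
    "reactions": ["nausea", "headache"],
}

# Picking reactions and drugs into the same table yields one row
# per (reaction, drug) pair: 2 reactions x 2 drugs = 4 rows.
rows = [
    {"report_id": report["report_id"], "reaction": r, "drug": d}
    for r, d in product(report["reactions"], report["drugs"])
]

print(len(rows))  # 4
```

A report with 10 drugs and 20 reactions would contribute 200 rows, which is why picking only the fields you need matters at scale.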

Instead, we could pick one table for reactions and another for drugs. This alternative approach provides a normalised schema over the data, which can then be joined together as needed. This is useful; however, it pushes the responsibility for joins onto downstream software such as Power BI or Snowflake.
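To make the downstream-join point concrete, here is a minimal sketch using SQLite as a stand-in for the downstream engine. The two-table schema and the report_id key are assumptions for illustration, not Precog's actual output:

```python
import sqlite3

# Two normalised tables, one for drugs and one for reactions, keyed by
# report_id (an illustrative schema, not Precog's actual output).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE drugs (report_id TEXT, drug TEXT);
    CREATE TABLE reactions (report_id TEXT, reaction TEXT);
    INSERT INTO drugs VALUES ('R-1001', 'aspirin'), ('R-1001', 'ibuprofen');
    INSERT INTO reactions VALUES ('R-1001', 'nausea'), ('R-1001', 'headache');
""")

# The downstream engine must perform the join itself; joining on
# report_id reconstructs the (reaction, drug) pairs for each report.
pairs = conn.execute("""
    SELECT r.reaction, d.drug
    FROM reactions AS r
    JOIN drugs AS d ON r.report_id = d.report_id
""").fetchall()

print(len(pairs))  # 4
```

The join reproduces the same four (reaction, drug) rows as the single-table approach, but the work of computing it now falls on the downstream system rather than on Precog.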

Precog is powered by proprietary technologies that make such joins orders of magnitude faster. Picking a table containing the information needed for your analysis can substantially improve query times and reduce compute and storage costs compared to the traditional approach.

Streaming, virtualisation and scale

Tables in Precog are streaming and virtualised. We can think of this as though the Precog table were a grocery list rather than the groceries themselves. We can use the same list in different stores the same way we can use the same Precog table with different datasets. We can also export tables and import them into different copies of Precog.

Precog does not load the whole dataset into memory, nor does it store the data at rest. Precog performs streaming selects, filters and tabulation at the byte level and streams the data to the destination. This approach reduces memory and storage requirements. Combined with Precog's built-in clustering, this makes scaling straightforward and efficient, as well as enabling more efficient cloud solutions.
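The streaming idea can be sketched with a Python generator pipeline. This is only a rough illustration of streaming-style processing, not Precog's byte-level implementation; the data and field names are assumptions:

```python
import csv
import io

# A small in-memory CSV stream standing in for a large source dataset.
source = io.StringIO(
    "report_id,reaction\n"
    "R-1001,nausea\n"
    "R-1002,headache\n"
    "R-1003,nausea\n"
)

def select_reactions(stream, wanted):
    """Yield matching rows one at a time rather than loading the dataset."""
    for row in csv.DictReader(stream):
        if row["reaction"] == wanted:
            yield row

# Rows flow through the generator to the destination as they are read;
# only one row is held in memory at a time.
matches = list(select_reactions(source, "nausea"))
print(len(matches))  # 2
```

Because each row is filtered and forwarded as it arrives, memory use stays flat regardless of the size of the source, which is the property that makes this style of processing easy to scale.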
