The brew Common Vocabulary
tl;dr – The Common Vocabulary defines brew's data types, each of which corresponds to an icon shown on the inputs and outputs of functionals. It ensures that functionals speak the same language.
What is the Common Vocabulary?
In short, the Common Vocabulary (CV) gives brew functionals an agreed-upon format for each data type so that they can communicate with one another.
The brew Common Vocabulary consists of a set of data types that define the syntax and semantics of the data that moves through a brew workflow during its execution. Functional developers define the inputs and outputs of their functionals in terms of this vocabulary. The Common Vocabulary standardizes data representation, reducing the complexity of connecting disparate data sources and types in one environment.
When building functionals, developers handle normalization inside the functional so that these data standards are met. The end user therefore does not have to deal with the minute details of normalizing things such as timestamps, coordinates, or CSV vs. XLS files. This lets an analyst build coherent workflows very quickly, while developers can integrate new functionals into a common environment where they are guaranteed to work with all of the existing capabilities.
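As an illustration of the kind of normalization a functional might hide from the end user, the following is a minimal Java sketch that maps three common timestamp spellings onto a single `java.time.Instant`. The class `TimestampNormalizer`, its `normalize` method, and the choice of accepted formats are all hypothetical and not part of brew; how the result would be wrapped in a CV temporal type is not shown because the text does not name one.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper, not part of brew: normalizes a few timestamp
// spellings to one canonical representation (an Instant in UTC).
final class TimestampNormalizer {
    private static final DateTimeFormatter SPACE_SEPARATED =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

    static Instant normalize(String raw) {
        String s = raw.trim();
        // Assumption: a bare integer is epoch milliseconds.
        if (s.matches("-?\\d+")) {
            return Instant.ofEpochMilli(Long.parseLong(s));
        }
        try {
            // ISO-8601 instant, e.g. 2020-01-02T03:04:05Z
            return Instant.parse(s);
        } catch (DateTimeParseException e) {
            // Assumption: "2020-01-02 03:04:05" carries no zone and means UTC.
            return LocalDateTime.parse(s, SPACE_SEPARATED).toInstant(ZoneOffset.UTC);
        }
    }
}
```

With this sketch, `"1577934245000"`, `"2020-01-02T03:04:05Z"`, and `"2020-01-02 03:04:05"` all normalize to the same instant, so downstream steps only ever see one format.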
The Common Vocabulary is the foundation of the brew self-service data analytics platform, making interoperability possible while maintaining ease of use for end users.
Each Common Vocabulary data type is represented by an icon on a functional's inputs and outputs.
Technical Information
The Common Vocabulary data types are the data types supported in brew functionals. They form a class hierarchy of concrete classes with CV_Super as the abstract superclass. Each concrete class is a primitive, temporal, Well-Known Text (WKT), or referential data type. The primitive, temporal, and WKT classes are direct objects that contain their value (e.g., CV_Integer contains a Long, and CV_LineString contains a string representation of a line). The referential classes contain a pointer to the value (e.g., CV_Table contains a Uniform Resource Identifier (URI) that points to the table persisted in a database).
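The distinction between direct and referential types can be sketched in Java as follows. Only the class names CV_Super, CV_Integer, CV_LineString, and CV_Table come from the description above; the fields, constructors, accessors, and the `isDirect` method are assumptions made for illustration and may not match the actual brew classes.

```java
import java.net.URI;

// Hypothetical sketch: every member below is an assumption, only the
// class names are taken from the text.
abstract class CV_Super {
    // Assumed helper: true when the object holds its value itself.
    abstract boolean isDirect();
}

// Primitive type: contains its value (a Long) directly.
final class CV_Integer extends CV_Super {
    private final Long value;
    CV_Integer(Long value) { this.value = value; }
    Long getValue() { return value; }
    @Override boolean isDirect() { return true; }
}

// WKT type: contains the string representation of a line directly.
final class CV_LineString extends CV_Super {
    private final String wkt;
    CV_LineString(String wkt) { this.wkt = wkt; }
    String getWkt() { return wkt; }
    @Override boolean isDirect() { return true; }
}

// Referential type: contains only a URI pointing at the persisted table.
final class CV_Table extends CV_Super {
    private final URI location;
    CV_Table(URI location) { this.location = location; }
    URI getLocation() { return location; }
    @Override boolean isDirect() { return false; }
}
```

Under these assumptions, `new CV_LineString("LINESTRING (30 10, 10 30, 40 40)")` carries the geometry itself, whereas a `CV_Table` carries only the location of the data, so a consumer would need to dereference the URI to read any rows.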