Empowering market data consumers and vendors.
DataHex ingests, stores, enhances, transforms and delivers data efficiently, so your team can focus on what truly matters.
Utilizing NLP to enhance discovery for unstructured data
- Third Bridge is a market-leading global investment research provider for human-led insights to support capital markets firms with their decision-making process.
- Third Bridge engaged RoZetta to provide an innovative solution to automatically tag forum transcripts with the companies from the reference data set that were mentioned in the interviews.
- Customers needed a better search experience and wanted to discover relevant content more efficiently.
- Tagging the transcripts would provide a strong foundation for additional enhancements.
- RoZetta’s data science experts, using the DataHex platform, mapped entities within 23,000 transcripts, which contained over 2.9 million entity mentions, an average of 125 mentions per transcript.
- Enabled linking to additional data sources such as company fundamentals, news, data from other providers and alternative sources of unstructured text.
- Tracked sentiment of entities over time.
- Automated summarization of transcripts.
- Additional entities such as People, Locations and Industries were tagged.
- Developed an objective relevance measure for the identified company mentions, initially based on transcript content, with the ability to enhance it further using customer activity insights.
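The tagging and relevance steps above can be illustrated with a minimal sketch. It assumes a simple alias dictionary and exact-match lookup, and it uses mention share as a stand-in relevance score. The reference data, function names, and scoring rule are all hypothetical and are not RoZetta's models, which the case study does not describe in detail.

```python
import re
from collections import Counter

# Hypothetical reference data: company identifier -> known aliases.
REFERENCE_DATA = {
    "ACME": ["Acme Corporation", "Acme Corp", "Acme"],
    "GLOBEX": ["Globex Corporation", "Globex"],
}

def build_pattern(reference):
    """Compile one regex that matches any alias, trying longer aliases first."""
    alias_to_id = {}
    for company_id, aliases in reference.items():
        for alias in aliases:
            alias_to_id[alias.lower()] = company_id
    ordered = sorted(alias_to_id, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(a) for a in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern, alias_to_id

def tag_transcript(text, pattern, alias_to_id):
    """Return a Counter of company identifier -> number of mentions in the text."""
    counts = Counter()
    for match in pattern.finditer(text):
        counts[alias_to_id[match.group(1).lower()]] += 1
    return counts

def relevance(counts):
    """Illustrative relevance score: each company's share of all tagged mentions."""
    total = sum(counts.values())
    return {cid: n / total for cid, n in counts.items()} if total else {}
```

A production approach would typically add a named-entity recognition model and disambiguation on top of dictionary matching; the sketch only shows the shape of the output, a set of company tags per transcript with a comparable score.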
RoZetta successfully tagged over 152,000 entities by leveraging NLP methods. This was ten (10) times more tags than the client had previously identified.
RoZetta’s models achieved 98.3% discoverability of entity mentions, versus 37.4% in the current state, resulting in a substantially better search and transcript filtering experience and generating more relevant results and watch list notifications
Automated summarization reduced manual processes and improved efficiency
Additional entity tags allowed the client to enhance its search and discoverability
Linking various datasets meant the ability to extract relationships between entities
Sentiment Index efficiently generated insights into the perception of the market
Improved customer interface increased client engagement, reducing attrition, propelling customer growth, and lifting revenue
Hedge fund - Capital markets
- A global market maker with decades of historical tick data required a data management solution to resolve the following issues:
- Reference data with invalid or missing links
- Expiry dates were not available for some instruments
- Multiple expiry dates for other instruments
- Missing reference fields and incorrect field lot units
- Option chains contain unrelated instrument codes
- Incorrect values in the intraday data, requiring recalculation from raw tick data
- Incorrect symbology mapping
- Establish an instance of DataHex in the client’s AWS environment
- Download, consolidate, validate and ingest historical and ongoing market data from multiple data vendors.
- Identify missing or corrupt data and liaise with data vendors to replace it.
- Remove irrelevant instruments from historical option chains.
- Create and maintain Security Master reference.
- Resolve data quality issues e.g. incorrect expiry dates, currency codes and last trade dates.
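Two of the reference data issues listed above, missing expiry dates and multiple expiry dates for one instrument, lend themselves to a simple automated check. The following is a minimal sketch under an assumed record schema (`code` and `expiry` keys); the field names and function are hypothetical and do not reflect DataHex internals.

```python
from collections import defaultdict

def find_expiry_issues(records):
    """Group instrument codes by expiry problem.

    records: iterable of dicts with 'code' and 'expiry' keys (assumed schema);
    'expiry' is an ISO date string, or None when the vendor supplied no value.
    """
    expiries = defaultdict(set)
    for rec in records:
        dates = expiries[rec["code"]]  # registers the code even if expiry is missing
        if rec.get("expiry"):
            dates.add(rec["expiry"])
    missing = sorted(code for code, dates in expiries.items() if not dates)
    conflicting = sorted(code for code, dates in expiries.items() if len(dates) > 1)
    return {"missing": missing, "conflicting": conflicting}
```

The output of a check like this is the kind of list that could be raised with a data vendor or published as known issues.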
DataHex platform optimizes storage, searching, querying, and extracting to provide rapid discovery, selection, and extraction within a cloud environment
DataHex enables seamless transformation and delivery into multiple cloud platforms and analytical environments
DataHex is data source agnostic and has ingestion pipelines for data sourced directly from multiple data vendors
File fragments integrated into an immediately usable format
Invalid and corrupt data issues are promptly managed with the data vendor
Mapping of the Symbology table to internal security identification tables
Publish a data calendar highlighting a list of known issues
Incorporate the calculation of one-minute timebars in the ingestion process
Data is presented in an analytics-ready state, minimizing data wrangling and manual validation and reducing overall data management costs
Automatic validating, updating, and maintenance of Security Master
API and GUI to search, schedule and extract data by instrument, portfolio, asset class, or exchange by time and date range
Data can be transformed and delivered into multiple cloud formats on extract
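As an illustration of the one-minute timebar calculation mentioned above, here is a minimal sketch that aggregates raw ticks into open/high/low/close/volume bars. The tick layout `(timestamp, price, volume)` and the function name are assumptions made for the example, not the DataHex implementation.

```python
from datetime import datetime

def one_minute_bars(ticks):
    """Aggregate (timestamp, price, volume) ticks into one-minute OHLCV bars.

    ticks must be sorted by timestamp. Returns a dict keyed by bar start time.
    """
    bars = {}
    for ts, price, volume in ticks:
        minute = ts.replace(second=0, microsecond=0)  # floor to the minute
        bar = bars.get(minute)
        if bar is None:
            # First tick in this minute sets every price field.
            bars[minute] = {"open": price, "high": price, "low": price,
                            "close": price, "volume": volume}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last tick seen so far in this minute
            bar["volume"] += volume
    return bars
```

Computing bars during ingestion means intraday values can be recalculated from raw tick data in one place, rather than relying on vendor-supplied intraday figures.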
Streamlining access to ICE market data in the cloud
- Intercontinental Exchange (ICE) is a leading provider of data, technology, and market infrastructure, capturing tick-by-tick data for trades and quotes across multiple assets
- To maintain a competitive edge, ICE has joined forces with RoZetta to offer its clients flexible delivery and the ability to incorporate data enhancements during this process
- Within Capital Markets the increasing costs and complexity of managing large-scale data remain a constant challenge as data continues to grow exponentially
- Quantitative analysts and Data Scientists are adopting new cloud and analytical platforms to develop value-creating trading strategies
- RoZetta delivered a Market-Data-as-a-Service pipeline to streamline market data transformation and enhancements, significantly reducing data wrangling and data management costs for ICE and their client base
- This partnership will provide analytic-ready data, transformed, enhanced, and delivered seamlessly to accelerate productivity gains for all end users spanning trading, research, compliance, and risk management
- The solution allows ICE’s clients to select a subset of data fields to be transformed for use in higher-cost environments while delivering the whole extract to lower-cost file storage
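The field-subset delivery described above can be sketched as a simple split: the full extract is kept for low-cost file storage, while only the selected fields are projected for the higher-cost analytical environment. The function and field names below are hypothetical and only illustrate the idea.

```python
def split_extract(rows, analytic_fields):
    """Return (full_extract, analytic_subset) from a list of row dicts.

    full_extract keeps every field, intended for low-cost file storage.
    analytic_subset keeps only analytic_fields, intended for a higher-cost
    analytical environment.
    """
    full = list(rows)
    subset = [{field: row[field] for field in analytic_fields} for row in full]
    return full, subset
```

Because the subset carries fewer columns, less data has to be processed and ingested on the analytical side, which is where the cost saving comes from.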
DataHex enables ICE clients to have licensed data seamlessly transformed and delivered into multiple cloud platforms and analytical environments to minimize time spent on data management tasks by high-value specialist roles
Incorporating data enhancements optimizes the processing and ingestion costs for the client
This service is a productivity tool for roles requiring analytics-ready data, reducing time spent on data wrangling while optimizing the total cost of ownership
Global financial markets historical data platform
- Opportunity existed to better support decision making in financial markets by providing accessible and usable financial data at scale
- Required a solution to accommodate:
- Structured and unstructured financial market data sets – tick data for more than 450 global exchanges
- Scale to cope with over 3 petabytes of data
- Data including over 10 billion transactions daily; 15 years of historical data and over 85 million financial instruments
- The solution was to offer a ‘big data’ solution before such a term existed
- Required a managed service to provide full end-to-end operational support
- To maintain a highly resilient, stable platform for a demanding client base
- There was an opportunity to partner with Thomson Reuters, which was looking to better serve the market using a trusted technology partner in RoZetta Technology
- Through design, build and operations, an on-premises technology solution was architected and embedded with data science tools to enable effective ingestion, transformation and presentation of financial market data
- The platform was fully managed for over 15 years, with 24/7 global support and system maintenance
Scalable, agile architecture able to meet the required performance levels
Used by over 650 clients representing over 90% of the world’s largest banks and 80% of largest global hedge funds
Delivered a long running, highly resilient solution – generating tens of millions of dollars in revenue for Thomson Reuters annually
Processing over 25 million client requests each year
With over 99.97% platform and data availability since 2008
Cloud migration and new product in historical data offering
- With rising demand for tick data history, Morningstar set out to lift the performance to make it quicker and easier for clients to access the tick data offering
- Required modernisation of the tick data technical infrastructure, migrating to cloud technologies and introducing new tools to improve the product offering
- Moving from legacy on-site storage using a single-threaded process, which previously required data to be copied to hard drives and shipped via courier
- Required a migration and conformance of a complex dataset covering:
- Over 2.5 petabytes of tick level 1 & 2 market data, 50 million instruments
- Covering over 200 trading venues and circa 99% of global equities coverage
- Data dated back to 2003 and included 10 years of US composite data, exchange messages and outage information
- Required a capability to quickly filter, extract and engage the data points including trade date and time, exchange time, volume, trade price, last bid and offer
- Full-service cloud migration to native AWS serverless technology environment
- Ability to ingest, curate and manage a considerable range of market data sets originating from global exchanges and markets
- Client interface/shop front to enable direct login access and purchases
- Additional mapping tools introduced to enable easy adoption across all major instrument codes
Fully scalable, agile architecture able to meet the performance requirements of a multi-petabyte operation
High availability, security, and resilience with data accessible through a range of interfaces such as API, React GUI, FTP, AWS S3 and more
Cut a typical customer extraction from an 8-week deliverable to less than 2 hours
Reduced barriers to adoption through effective instrument mapping tools