Financial news/aggregator

TechVariable develops modular information aggregation platform for Director Intel






Increase in Data Accuracy


Based in Colorado, Director Intel is a platform that aggregates information about companies listed on the Russell 3000 Index and their directors, drawing on data available from the U.S. Securities and Exchange Commission (SEC).


Rapid Application Development, ETL as a Service


Python, Django, CoreNLP, NLTK, Beautiful Soup, Selenium, Pandas, IEX Cloud, Bing News, PostgreSQL, SMTP server, Azure Application Insights, ReactJS




The company wanted to build a real-time aggregation platform for the Russell 3000 Index, an equity index of 3,000 of the largest US-traded stocks.

Future users of the platform would get accurate information sourced from the SEC, such as the valuation of a director’s stock holdings, contact information for the board, and the company’s Environmental, Social, and Corporate Governance (ESG) record. In the old version of the platform, similar information was:

  • Unfriendly to use: Cluttered with legal jargon and not organized for ease of use.
  • Segregated and time-consuming to sort: Required browsing multiple sites, which senior executives do not have time for.
  • Not shareable: Users could not share the information they reviewed with peers via email.
  • Lacking premium content: No scope for users to create white papers or other valuable collateral for the business.


Director Intel chose TechVariable to develop the platform because of its technical expertise, offshore development capabilities, and proven fast turnaround. We worked on the initial scope for three months, as planned. Director Intel has since extended its engagement with us to strengthen the project further.

  • Create a scalable, modular platform that can expand features with business growth
  • Qualify for listing on RapidAPI – a centralized internal marketplace for APIs.
  • Ensure reliability of service and protect sensitive data.

We took a microservice architecture approach because the client wanted the platform to scale as needed and delivered the scope in phases. We used Django REST framework for the back end, PostgreSQL for the database, and ReactJS for the front end.

The core information processing relied on NLP algorithms, since most of the data was publicly available as text but highly unstructured. For example, to extract meaningful entities, we used a custom spaCy pipeline as an entity extractor. Based on those entity values, we computed a relevance score for each document we scraped from the web. The scoring algorithm is based on semantic analysis and word embeddings. After finding high-scoring documents, we use our custom-made document parser to extract the relevant information and visualize it in the front end.
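The exact scoring algorithm is not described here, so the following is only a minimal sketch of one common way to score relevance with word embeddings: average the word vectors of the entity terms and of the document, then take the cosine similarity. The toy vectors and the names `embed`, `relevance_score`, and `rank_documents` are assumptions made for illustration, not the production code.

```python
import math
from typing import Dict, List, Tuple

# Toy 3-dimensional word vectors, invented for illustration only.
# A real system would load pretrained embeddings instead.
TOY_VECTORS: Dict[str, List[float]] = {
    "director": [0.9, 0.1, 0.0],
    "board": [0.8, 0.2, 0.1],
    "shares": [0.1, 0.9, 0.0],
    "stock": [0.2, 0.8, 0.1],
    "weather": [0.0, 0.1, 0.9],
    "rain": [0.1, 0.0, 0.9],
}


def embed(tokens: List[str]) -> List[float]:
    """Average the vectors of known tokens; unknown tokens are skipped."""
    known = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not known:
        return [0.0, 0.0, 0.0]
    return [sum(column) / len(known) for column in zip(*known)]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def relevance_score(entity_terms: List[str], document_text: str) -> float:
    """Score a document against the terms describing an entity."""
    doc_tokens = document_text.lower().split()
    return cosine(embed(entity_terms), embed(doc_tokens))


def rank_documents(
    entity_terms: List[str], documents: List[str]
) -> List[Tuple[float, str]]:
    """Return (score, document) pairs, highest score first."""
    scored = [(relevance_score(entity_terms, d), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling `rank_documents(["director", "board"], docs)` on a list of scraped texts returns them ordered by score, so only the top results need to be passed on to the document parser.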

Third-party services such as IEX Cloud supply real-time information, for example stock prices.

For search, we used Elasticsearch to reduce the load on the database and to improve search results.

Modules implemented

Data Extraction

This module was responsible for extracting data from public sources. The data was in textual format and unstructured.
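As a rough illustration of the first step in such a module, the sketch below pulls the visible text out of a fetched HTML page. The project stack lists Beautiful Soup and Selenium; this example swaps in the standard library's `html.parser` so it runs without extra dependencies, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import List


class VisibleTextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self._chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped tags.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        # Join the chunks and collapse runs of whitespace into one space.
        return " ".join(" ".join(self._chunks).split())


def extract_text(html: str) -> str:
    """Return the visible text of an HTML document as a single line."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

The resulting plain text is still unstructured; it is the input that a downstream NLP step would work on.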

NLP module

This module was responsible for providing context to the data. It scored the data by relevance and associated it with the appropriate business entity.
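To make the association step concrete, here is a simplified sketch that links a piece of text to a company by counting mentions of known aliases. It is a plain regular-expression stand-in for a trained entity extractor, and the alias table, tickers, and function names are all hypothetical.

```python
import re
from typing import Dict, List, Optional, Pattern

# Hypothetical alias table: ticker -> names the company may appear under.
COMPANY_ALIASES: Dict[str, List[str]] = {
    "ACME": ["Acme Corporation", "Acme Corp"],
    "GLOBX": ["Globex Inc", "Globex"],
}


def build_patterns(aliases: Dict[str, List[str]]) -> Dict[str, Pattern[str]]:
    """Compile one case-insensitive, whole-word pattern per ticker."""
    patterns = {}
    for ticker, names in aliases.items():
        # Longest alias first, so the longer name is tried before its prefix.
        ordered = sorted(names, key=len, reverse=True)
        alternation = "|".join(re.escape(name) for name in ordered)
        patterns[ticker] = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return patterns


def associate(text: str, patterns: Dict[str, Pattern[str]]) -> Dict[str, int]:
    """Return mention counts for every ticker mentioned at least once."""
    counts: Dict[str, int] = {}
    for ticker, pattern in patterns.items():
        mentions = len(pattern.findall(text))
        if mentions:
            counts[ticker] = mentions
    return counts


def primary_entity(text: str, patterns: Dict[str, Pattern[str]]) -> Optional[str]:
    """Return the most-mentioned ticker, or None if nothing matched."""
    counts = associate(text, patterns)
    if not counts:
        return None
    return max(counts, key=counts.get)
```

A document that mentions several companies is attached to the one it names most often; a production system would likely also weigh context, not only counts.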

Elasticsearch

This module is built on top of Elasticsearch to provide robust searching and sorting. We implemented fuzzy and phonetic search here.
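The sketch below shows how fuzzy and phonetic matching can be combined in Elasticsearch, written as plain Python dictionaries that could be sent as request bodies. It assumes the `analysis-phonetic` plugin is installed; the field name `name`, the analyzer and filter names, and the choice of the `double_metaphone` encoder are illustrative assumptions, since the actual index configuration is not described here.

```python
from typing import Any, Dict


def phonetic_index_settings() -> Dict[str, Any]:
    """Index body with a phonetic sub-field on a hypothetical `name` field."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Token filter provided by the analysis-phonetic plugin.
                    "name_phonetic": {
                        "type": "phonetic",
                        "encoder": "double_metaphone",
                        "replace": False,
                    }
                },
                "analyzer": {
                    "name_phonetic_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "name_phonetic"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                "name": {
                    "type": "text",
                    "fields": {
                        "phonetic": {
                            "type": "text",
                            "analyzer": "name_phonetic_analyzer",
                        }
                    },
                }
            }
        },
    }


def name_search_query(term: str) -> Dict[str, Any]:
    """Match a name either fuzzily (typos) or phonetically (sound-alikes)."""
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {"name": {"query": term, "fuzziness": "AUTO"}}},
                    {"match": {"name.phonetic": {"query": term}}},
                ],
                "minimum_should_match": 1,
            }
        }
    }
```

With this layout, the fuzzy clause tolerates small spelling mistakes, while the phonetic sub-field catches names that sound alike but are spelled differently.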

High Level Design Architecture


The Result

1) Indexing with Elasticsearch made the resources easy to search.
2) The ability to share resources within the platform, and outside it via shareable links, extended the platform's reach.
3) Using NLP to extract data from unstructured text and present it as a report helped the client understand the data visually at a glance.
