Open data content and discovery platform Enigma today announced $4.5 million in investor funding to propel the start-up’s next phase of growth. Key to Enigma’s early success has been a heavy focus on data cleaning and a customer on-boarding strategy that has targeted just a handful of industry sectors. The next phase will see the development of “more scoped APIs” and a broader targeting of enterprise customers, according to Marc DaCosta, cofounder of Enigma, who spoke with ProgrammableWeb in the lead-up to today’s announcement.
Building a viable business on an open data platform has been a challenge for many early market players in recent years. New York start-up Enigma.io was founded in 2012 with the aim of taking a different approach from previous efforts to create an open and public data library for business customers. Since its public launch in May 2013, Enigma has won awards, raised initial seed funding of $850,000, and created excitement among early adopters. Today, it has secured $4.5 million in a Series A funding round from Comcast Ventures, American Express Ventures, Crosslink Capital, and the New York Times.
“Essentially, we are using the round to grow our engineering and sales teams aggressively,” cofounder DaCosta told ProgrammableWeb.
So far, Enigma has been able to avoid some of the pitfalls that other data platforms have met in seeking a viable business model. One risk is getting lost in the potential opportunities that come with curating a large and disparate array of public datasets. Collecting data on everything from real-estate assessments, product imports, and industry workforce statistics to SEC filings and government spending contracts can create a sense of overconfidence that leads a start-up to try to be all things to all industry sectors. Enigma has avoided this by focusing its customer on-boarding on just a handful of initial industry sectors, including financial institutions, media, and professional services.
“We will be continuing our growth in the banking and financial sectors,” DaCosta said. In line with Joel Gurin’s prediction that 2014 will see a greater number of financial industry players making use of open data, DaCosta expects Enigma to grow its customer pool beyond the hedge funds that have been early adopters of the platform.
“We see two main drivers, less in the direct hedge funds and more in banking and credit underwriters and property insurance. With credit underwriters, lenders have had a data deficit in which they feel they don’t have enough information for lending decisions. Where Enigma is able to add value is to give underwriters a lot more information through our data interface, which in turn lets lenders expand financial support access to more small businesses and companies. In property insurance, we are able to rationalize building and neighborhood data so that insurance advisers can make better decisions, like asking what the impact is if there is a doctor’s office in the building, what the impact is of a liquor store opening in the neighborhood, and how does that correlate with insurance risks.
“These are illustrative examples. They speak to a pattern of thinking. It is an exciting time for using public data in this way: There is a tremendous amount of insight and data available.”
Another aspect to Enigma’s early success, DaCosta believes, is the data acquisition approach of the platform overall. “For us, to really deliver the most value in this space, we have needed to take data from the wild; how we have cleaned, linked, and geocoded the data is baseline to making it useful. Then, as we have learned what the interesting use cases are, we have been able to move up the value chain and create more context around the data. We believe that this is taking a different tech approach to most of the work done in other data portals. Most are doing [data acquisition] in a one-off way and then cleaning it to fix idiosyncrasies [such as data with spelling errors or multiple date formats]. We have built a more generalizable platform. For every new dataset that comes into Enigma, we learn how to clean and link it in a systemic approach. We also use other heuristics to identify contextual clues that feed into our linked data.”
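To make the kind of cleaning and linking DaCosta describes more concrete, here is a minimal Python sketch. It is not Enigma's code or method: the function names, the list of date formats, the company suffixes, and the sample records are all assumptions made for illustration. It shows two small steps, namely bringing dates written in several formats into one form, and linking rows from two datasets by a normalized company name.

```python
import re
from collections import defaultdict
from datetime import date, datetime
from typing import Optional

# Assumption: the date formats we expect to meet in raw public data.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")

# Assumption: legal suffixes that should not affect whether two names match.
SUFFIXES = {"inc", "llc", "corp", "co", "ltd"}


def normalize_date(raw: str) -> Optional[date]:
    """Try each known format in turn; return None if none of them fits."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def normalize_name(raw: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop legal suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", raw.lower()).split()
    return " ".join(word for word in words if word not in SUFFIXES)


def link_records(left, right, name_field="name"):
    """Pair rows from two datasets whose normalized names are equal."""
    index = defaultdict(list)
    for row in right:
        index[normalize_name(row[name_field])].append(row)
    return [
        (row, match)
        for row in left
        for match in index.get(normalize_name(row[name_field]), [])
    ]


if __name__ == "__main__":
    # Hypothetical sample rows, invented for this example.
    contracts = [{"name": "Acme Widgets, Inc.", "signed": "03/05/2014"}]
    filings = [{"name": "ACME WIDGETS INC", "filed": "2014-03-05"}]
    for contract, filing in link_records(contracts, filings):
        print(contract["name"], "<->", filing["name"])
        print(normalize_date(contract["signed"]), normalize_date(filing["filed"]))
```

The sketch only handles format differences and simple name variants. Spelling errors, geocoding, and the contextual heuristics mentioned in the quote would need further steps, such as fuzzy matching, that are not shown here.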
According to DaCosta, this approach means that the Enigma platform is more flexible at providing tools for data discovery. “Most data portals lock you in so you are only able to access metadata. Enigma enlightens identities across the datasets rather than, say, the Bloomberg approach that is still siloed. We are at a much more scalable position and can continue to develop high-quality APIs, as we have put the stake in the ground as far as building a data acquisition strategy that is unique, cleans the data, and lets customers interact with all of these datasets in the same way.”
Enigma will use the new funding to focus on building more API tools that allow customers to feed data from Enigma into their own proprietary layer, in whatever ways they prefer to display and use the results.
“The API calls [to Enigma] have been growing at a heavy clip for us,” DaCosta said. “From a technical perspective, we hope to move into more scoped APIs: creating specific graph models based on this data. So, from a headcount perspective, we will be skewing on the engineering side to develop high-value products. We will be looking at who is consuming data, what they are trying to do, and how data can help them.”
While maintaining a heavy focus on Enigma’s already identified industry sectors, DaCosta hopes that the funding partnerships with stakeholders such as American Express Ventures will also mean widening use of the platform among enterprise customers. “There is real fluency across the Fortune 500 set that, with public data, ‘there is gold in them hills,’ so to speak. The value that we are bringing – and the trust that we are building – is that there are now entire troves of data available to enterprises that can help them attract new business.”
By Mark Boyd. Mark is a freelance writer focusing on how we use technology to connect and interact. He writes regularly about API business models, open data, smart cities, Quantified Self, and e-commerce. He can be contacted via e-mail, on Twitter, and on Google+.