How Airbnb rebuilt its employee-facing data resource portal on top of a graph database

Airbnb has completely rebuilt its employee-facing data resource portal in an attempt to democratise reports and dashboards and to encourage a data-driven approach across the organisation.

The portal provides access to information such as sales data, user metrics or app performance and is now accessible to all Airbnb staff, whether a data scientist, software engineer or customer service agent.

Speaking at graph database vendor Neo4j’s annual GraphConnect conference in London yesterday, Airbnb software engineer John Bodley and data visualisation engineer Chris Williams explained how they rebuilt the back end and front end of the company’s employee-facing data portal.


Bodley said a key incentive for building the new data portal was moving Airbnb away from “tribal knowledge” – information held solely by particular groups within an organisation, such as data scientists – because this “often stifles productivity”.

Bodley said that employee surveys often returned poor scores on the statement: “Information I need to do my job is hard to find”. Data was often siloed and inaccessible, and lacked context.

However, to democratise the data they needed a front end that was accessible to less data-literate employees, with appropriate guardrails to stop them from breaking anything upstream.

Williams said that although “the back-end of data tools are often so complex that the front-end is an afterthought, this should not be the case”. In fact, the density and complexity of the data involved make simple, intuitive design paramount if you want widespread adoption.

“We tried to embrace a clean and minimalistic design to maintain clarity despite all of the data content,” Williams said. “We also had to make the app fast and snappy as slow interactions generally disincentivise interaction and exploration.”

Williams did admit that “most internal design patterns aren’t designed for data rich applications, so we had to do a lot of improvising.” The final product looks very much like an Airbnb application, resembling the consumer-facing web app as it might look if it were designed for data tables instead of beautiful apartments.

Next, the web app had to have “killer search functionality”, Williams said. Borrowing from Google’s design, the data portal has search filters for data resources, charts, groups, teams and people. Information is then displayed hierarchically alongside resource metadata, so that users can quickly gauge its relevance. Lastly, the search results show the top consumers of each resource “to quickly surface relationships and context”, Williams said.
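As an illustration only, the search behaviour Williams describes (a type filter, ranked results and the top consumers of each hit) can be sketched in a few lines of Python. Everything here is an assumption: the sample records, the `score` field, the usage counts and the function names are hypothetical, and plain lists and dicts stand in for the portal’s real search and graph back ends.

```python
# Hypothetical sample data; plain Python structures stand in for real back ends.
SEARCH_INDEX = [
    {"name": "bookings", "type": "Table", "score": 0.42},
    {"name": "weekly_bookings", "type": "Dashboard", "score": 0.17},
    {"name": "listings", "type": "Table", "score": 0.08},
]

# Resource name -> {consumer: number of uses}. The counts are made up.
CONSUMERS = {
    "bookings": {"alice": 30, "bob": 10, "carol": 2},
    "weekly_bookings": {"bob": 7},
}


def search(term, resource_type=None):
    """Substring match on the name, optional type filter, best score first."""
    hits = [
        doc for doc in SEARCH_INDEX
        if term.lower() in doc["name"].lower()
        and (resource_type is None or doc["type"] == resource_type)
    ]
    return sorted(hits, key=lambda doc: doc["score"], reverse=True)


def top_consumers(name, limit=2):
    """Return the heaviest consumers of a resource, most active first."""
    usage = CONSUMERS.get(name, {})
    return sorted(usage, key=usage.get, reverse=True)[:limit]


def handle_search(term, resource_type=None):
    """Attach top-consumer context to every search hit."""
    return [
        {**doc, "top_consumers": top_consumers(doc["name"])}
        for doc in search(term, resource_type)
    ]


results = handle_search("bookings")
dashboards_only = handle_search("bookings", resource_type="Dashboard")
```

Keeping the consumer lookup separate from the text search mirrors the idea that relevance and context are different questions: one finds the resource, the other suggests who to ask about it.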

This level of context is important because “assessing who may be the right point of contact for a resource is just as pertinent as the resource itself,” according to Bodley.

Williams also noted that teams often had a defined set of “tables they query, dashboards they look at, key metrics”, so the engineers created group spaces where teams could “organise, curate and quickly link people to these resources”.

The data portal also lets users ‘pin’ and favourite resources, view high-level company dashboards, and follow data lineage and related content, all of which is meant to encourage exploration.

Airbnb went live with its new data portal in December and Bodley told Computerworld UK that they have already seen the user base jump from “between 30-40 employees a week, to nearing 500 now”. Airbnb employs more than 3,500 people, distributed widely around the globe.

Bodley says a major education effort is now under way to get more people using the portal and to teach less data-literate employees how to get the most out of it.


Airbnb is a platform that links people who have properties to rent out for short stays with travellers looking for unique experiences. What matters is the connections between hosts and travellers. “Our data represents a graph, so it felt logical to use a graph database to store the data,” Bodley said.

“Our ecosystem is a graph and the data resources are nodes and the connectivity is relationships,” Bodley added.
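To make the nodes-and-relationships idea concrete, here is a minimal sketch of such a model in plain Python. It is not Airbnb’s schema: the labels (`User`, `Table`, `Dashboard`), the relationship types (`CONSUMES`, `CREATED`, `DEPENDS_ON`) and the class names are all hypothetical, chosen only to show how data resources and the links between them could be represented.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """A data resource or a person. Both fields are illustrative."""
    label: str  # hypothetical labels such as "Table", "Dashboard", "User"
    name: str


@dataclass
class ResourceGraph:
    """Directed graph: source node -> list of (relationship type, target node)."""
    edges: dict = field(default_factory=lambda: defaultdict(list))

    def relate(self, source, rel_type, target):
        """Record a typed relationship from source to target."""
        self.edges[source].append((rel_type, target))

    def neighbours(self, source, rel_type):
        """Targets reachable from source over one relationship of rel_type."""
        return [t for r, t in self.edges.get(source, []) if r == rel_type]


graph = ResourceGraph()
alice = Node("User", "alice")
bookings = Node("Table", "bookings")
weekly = Node("Dashboard", "weekly_bookings")

graph.relate(alice, "CONSUMES", bookings)
graph.relate(alice, "CREATED", weekly)
graph.relate(weekly, "DEPENDS_ON", bookings)

consumed_by_alice = graph.neighbours(alice, "CONSUMES")
lineage_of_weekly = graph.neighbours(weekly, "DEPENDS_ON")
```

A graph database stores this same shape natively, so questions such as “what does this dashboard depend on?” become a one-hop traversal rather than a join.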

Airbnb has more than 200,000 tables stored in an Apache Hive data warehouse spread across clusters, as well as more than 6,000 Tableau workbooks and charts. To build the data portal, Airbnb needed to push this data into Neo4j’s graph database and then layer web services and search capability on top.

As Williams put it, the data portal is “an umbrella data tool trying to bring all siloed data tools together to generate a picture of the overall data system.”

The data journey looks roughly like this: “Every day the data starts in Hive, we use Airflow to push it to Python, in Python we have the graph represented as an object and from this we compute a weighted PageRank on the graph and that helps a lot to improve search rankings.

“The data is then pushed to Neo4j. From here it forks, nodes get pushed into Elasticsearch via a GraphAware plugin and from there Elasticsearch will serve as our search engine, which is a fairly common technology in Airbnb.

“Results for Elasticsearch queries are fetched by the web server and additionally results to Neo4j queries are fetched via Neo4j.”
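The talk as reported does not say how Airbnb computes its weighted PageRank. As a rough sketch, a textbook power-iteration version in plain Python might look like the following. The edge weights are hypothetical usage counts, the node names are invented, and the damping factor of 0.85 is the conventional default rather than a value from the talk.

```python
def weighted_pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank on a weighted directed graph.

    edges maps (source, target) pairs to positive weights. A node passes
    its rank to its targets in proportion to the weight of each edge.
    """
    nodes = sorted({node for pair in edges for node in pair})
    count = len(nodes)

    # Total outgoing weight per node, used to normalise each edge.
    out_weight = {node: 0.0 for node in nodes}
    for (source, _), weight in edges.items():
        out_weight[source] += weight

    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {node: base for node in nodes}
        for (source, target), weight in edges.items():
            share = weight / out_weight[source]
            new_rank[target] += damping * rank[source] * share
        rank = new_rank
    return rank


# Hypothetical usage counts: how often each consumer touches each resource.
edges = {
    ("alice", "bookings"): 30,
    ("alice", "listings"): 5,
    ("bob", "bookings"): 10,
    ("weekly_dashboard", "bookings"): 1,
}

ranks = weighted_pagerank(edges)
most_important = max(ranks, key=ranks.get)
```

In this toy graph the heavily used `bookings` table ends up with the highest rank, which is the kind of signal that could then be stored alongside each node and used to order search results.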

Lastly, on the front-end Williams said that the data portal doesn’t allow for “freeform exploration of our graph, as the Neo4j UI does,” because they wanted to maintain a “curated view of the graph which attempts to provide utility while maintaining guardrails for the less data literate employees”.

This story, “How Airbnb rebuilt its employee-facing data resource portal on top of a graph database”, was originally published by Computerworld UK.