Zillow Uses Analytics, Machine Learning To Disrupt With Data


Industry disruptor Zillow leverages data about residential real estate and makes it available to the general public. The company's senior director of data science and engineering shares the secrets behind Zillow's data stack.

Residential real estate site Zillow stormed onto the market in the 2000s, letting consumers check on the property value of their own homes and those of all their friends, family members, and acquaintances, too, much to the dismay of real estate professionals.

Zillow was founded by a couple of former Microsoft executives who had earlier started travel site Expedia. The site threatened to disrupt the real estate market when it debuted in 2006, giving people access to information that had previously been available only through real estate pros.

Ten years later, Zillow has proven it has staying power. Built on the idea of ingesting, processing, and serving data from multiple sources to consumers, the company has made a name for itself with its "Zestimate" -- its secret data-driven formula for predicting the value of a piece of real estate. But none of this happens without a sophisticated IT department and data operation behind the scenes.

Jasjeet Thind, senior director of data science and engineering at Zillow, says that Zestimate is one of the ways Zillow uses machine learning. This real estate value estimate was the first available home valuation model, and it's composed of hundreds of models behind the scenes -- linear models, decision trees, deep learning, and more -- to predict values for every single home in the country, Thind said.

Thind gave IT and data professionals an inside view of what is under the hood at Zillow during a presentation at September's Strata + Hadoop event in New York.

"Zillow Group's mission is to build the largest, most trusted, and vibrant home-related marketplace," he said during the session. Zillow Group refers to the company that Zillow has grown into in the decade since its launch. Now a publicly held company, Zillow owns several brands, including Trulia, HotPads, StreetEasy, Naked Apartments, Mortech, dotloop, and Retsly.

Thind said that Zillow operates a data lake composed of data from all those brands. It also gets data from counties, the MLS, real estate brokers, and directly from users via the "Claim Your Home" feature. Thind said that Zillow's ability to get updated information directly from homeowners is one of its key competitive edges.

Data obtained from government records can be tricky and not very glamorous to ingest. Some of this property data is in JPG form, while other data is typed text. Thind said that Zillow leverages OCR technology in its ingestion process to help optimize costs. Because the data can be input faster, the system also improves user experience.

Ensuring data quality is a big topic at Zillow, Thind said. Public records data comes in many different formats, and the company employs a data analyst whose full-time job is to ensure data quality. Zillow uses trend detection to look for anomalies in the number of sales transactions.
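Thind did not describe how the trend detection works. As a rough illustration only, the sketch below flags a month whose transaction count strays far from the trailing median of recent months. The function name, the six-month window, the threshold, and the sample counts are all assumptions made for this example, not Zillow's method or data.

```python
from statistics import median


def flag_anomalous_counts(counts, window=6, threshold=3.0):
    """Return indices whose count deviates sharply from the trailing median.

    Hypothetical helper: the window and threshold are illustrative choices.
    """
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        center = median(history)
        # Median absolute deviation: a spread estimate that a single
        # outlier in the history cannot inflate.
        mad = median(abs(x - center) for x in history)
        scale = mad if mad > 0 else 1.0
        if abs(counts[i] - center) / scale > threshold:
            flagged.append(i)
    return flagged


# Made-up monthly sales counts for one county; month 7 drops sharply.
monthly_sales = [120, 118, 125, 122, 119, 121, 123, 40, 120, 124]
anomalies = flag_anomalous_counts(monthly_sales)
```

A median-based check is used here because one bad month in the history would distort a mean and standard deviation, while the median barely moves.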

There are also checks at the data field level, looking for listings that have, for example, 30,000 bedrooms. Zillow also flags certain types of transactions, such as foreclosures, because these deals are not used in the Zestimate calculations.
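Field-level checks and transaction flags of this kind might look something like the sketch below. The record layout, the bedroom ceiling, and the function names are hypothetical, chosen only to make the idea concrete; the article does not say how Zillow implements these rules.

```python
from dataclasses import dataclass

# Assumed limits for illustration; real thresholds are not given in the talk.
MAX_PLAUSIBLE_BEDROOMS = 50
EXCLUDED_SALE_TYPES = {"foreclosure"}


@dataclass
class SaleRecord:
    property_id: str
    bedrooms: int
    sale_type: str
    price: float


def field_issues(record):
    """Return a list of field-level problems found in one record."""
    issues = []
    if not 0 <= record.bedrooms <= MAX_PLAUSIBLE_BEDROOMS:
        issues.append("implausible_bedrooms")
    if record.price <= 0:
        issues.append("non_positive_price")
    return issues


def usable_for_valuation(record):
    """True if the record is clean and its sale type is not excluded."""
    return not field_issues(record) and record.sale_type not in EXCLUDED_SALE_TYPES


records = [
    SaleRecord("a1", 3, "standard", 450000.0),
    SaleRecord("b2", 30000, "standard", 380000.0),   # bad bedroom count
    SaleRecord("c3", 4, "foreclosure", 210000.0),    # flagged sale type
]
usable_ids = [r.property_id for r in records if usable_for_valuation(r)]
```

Keeping the field checks separate from the sale-type flag means a foreclosure record can still be stored as valid data while being left out of valuation inputs.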

Zillow's technology platform includes Apache Spark. The company also uses Redis and Python for real-time scoring. Zillow taps AWS S3 for cloud storage and relies on AWS Redshift and Presto for its data warehouse. Thind said Zillow specifically turns to Presto when looking at historical data.

Beyond the Zestimate, Zillow provides other numbers to its audience, too, such as a Turbo Zestimate, and a "hot homes" designation (which predicts how fast a home will sell). Many of these figures are based on Zillow's Zestimate calculation.

Zillow has also invested in predicting the preferences of its consumer users through personalization and search. Thind said Zillow uses different kinds of user vectors depending upon how sparse the signals are for a particular user.  
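The talk summary gives no detail on what these user vectors contain. One way to picture the idea, purely as an assumption for this example, is a fallback rule: a user with few recorded signals gets a coarse segment-level vector, while a user with enough history gets the average of the feature vectors of homes they viewed. The function names, the `min_signals` cutoff, and the two-dimensional vectors are all invented for illustration.

```python
def mean_vector(vectors):
    """Element-wise mean of equal-length numeric vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def user_vector(viewed_home_vectors, segment_vector, min_signals=5):
    """Pick a representation based on how much signal the user has.

    Hypothetical rule: below `min_signals` viewed homes, fall back to a
    vector shared by the user's broader segment.
    """
    if len(viewed_home_vectors) < min_signals:
        return segment_vector
    return mean_vector(viewed_home_vectors)


segment = [0.5, 0.5]
sparse_user = user_vector([[1.0, 0.0]], segment)
active_user = user_vector([[1.0, 2.0], [3.0, 4.0]], segment, min_signals=2)
```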

Users who share their email address with Zillow can get recommendations for homes they would like, based on what they've searched for in the past. Zillow may also send these users personalized collections of homes based on what factors seem important to the users, such as good school districts.

For the data pros in the audience, Zillow offers a special gift. The company publishes a small selection of data sets on its website that users can download. They are at Zillow.com/data.
