CloudCar’s Senior Product Manager Seema Khandkar was among the thought leaders who spoke at the annual Global Big Data Conference in Santa Clara, California. Seema blogs about CloudCar’s journey establishing an analytics and machine learning platform and our three key takeaways.
I recently had the pleasure of representing my company, CloudCar, as a speaker at the Global Big Data Conference on August 30th. The talk covered the challenges we faced as a small business as we made our foray into machine learning as a core offering of CloudCar’s cloud-based infotainment platform. It was an interactive session: the audience showed a lot of interest in CloudCar’s products and machine learning components, and asked how we handled our GDPR compliance initiatives.
CloudCar is building its next-generation platform, one that understands and anticipates drivers’ needs. We like to call it a digital assistant for the car. With connected cars entering the mainstream consumer psyche, drivers expect their vehicles to be as smart as, or even smarter than, their mobile devices. The car’s ecosystem is more complex than a mobile device, and driver safety is our primary focus at all times.
Our platform integrates content from different content providers and normalizes it into categories, which we refer to as “domains,” such as Media, Places, Productivity, and Smart Home. For example, content from TuneIn and Deezer falls within the Media domain, and content from Yelp and TripAdvisor is placed, no pun intended, in the Places domain. The machine learning platform works as a horizontal layer that cuts across these domains. Its goal is to understand drivers’ implicit and explicit preferences and learn from their habits in order to deliver personalized and timely content. For instance, say I listen to podcasts while driving to work on weekdays and listen to rock music on my way back home. I also like to stop at my favorite coffee shop and pick up coffee on my way to work. CloudCar’s personalized and predictive machine learning system learns from my driving patterns; it will recommend music based on my listening habits, suggest my likely destinations, and advise me about detours and delays on my usual routes.
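To make the idea of domains concrete, here is a minimal sketch of how provider content might be normalized into a common, domain-tagged shape. It is an illustration only and not CloudCar’s actual implementation: the `ContentItem` class, the `normalize` function, and the `title`/`name` field names are all hypothetical, and the provider-to-domain mapping simply mirrors the examples above.

```python
from dataclasses import dataclass

# Provider-to-domain mapping, taken from the examples in the text.
PROVIDER_DOMAINS = {
    "TuneIn": "Media",
    "Deezer": "Media",
    "Yelp": "Places",
    "TripAdvisor": "Places",
}


@dataclass(frozen=True)
class ContentItem:
    """Hypothetical normalized record shared by every domain."""
    provider: str
    domain: str
    title: str


def normalize(provider: str, raw: dict) -> ContentItem:
    """Map a provider-specific record to a domain-tagged item."""
    domain = PROVIDER_DOMAINS.get(provider)
    if domain is None:
        raise ValueError(f"unknown provider: {provider}")
    # Assumption: providers label the display name as "title" or "name".
    title = raw.get("title") or raw.get("name") or ""
    return ContentItem(provider=provider, domain=domain, title=title)


item = normalize("Yelp", {"name": "Corner Coffee"})
print(item.domain, item.title)
```

Once every item carries a domain tag in the same shape, a learning layer can look across Media, Places, and the other domains without caring which provider the content came from.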
Data forms the core of any machine learning platform. Building the data collection processes, standardizing data across products, and normalizing the data so that it could be used for machine learning were among our very first challenges. From there, the next hurdle was establishing credibility around complex algorithms and trained models and proving their efficacy. We also spent significant time ensuring that appropriate security measures and policies were in place to achieve GDPR compliance.
Summarized below are three key takeaways from our journey:
- A data-first approach: Data is a currency and should be at the forefront of product development cycles. Large companies such as Google and Amazon are winning big by collecting data and gaining insights, which keeps them at the forefront of the next wave of user needs. It is always hard for a smaller company to balance the strategic against the tactical, but data must be treated as a priority to ensure sufficient forward leverage and relevance. Neglecting it will bite you later, and it’s worth spending the extra time and resources up front.
- Data definition and collection: We took an inventory of all the data elements we were capturing and noted whether each one was personally identifiable information (PII). The data is collected at the user interface (UI) level and flows through the data pipelines all the way to our Big Data systems, so data ownership is required at every level of the stack. Once our team had the inventory of data elements, we combed through them to check whether we had a business justification for collecting each piece of information, and we established data expiration policies. If we didn’t need some data, we didn’t keep it.
- Establishing credibility: In order to establish credibility, we focused on core driver use cases. Creating a niche helps: developing the core use cases in depth, rather than going broad, lets you compete without needing to boil the ocean. Our team also used examples to explain how our algorithms work, so that automotive OEMs and ultimately the end user (the driver) understand why certain content or notifications are triggered. The more transparent we are, the more trust we build.
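The data inventory described in the second takeaway can be pictured with a short sketch. This is an assumption-laden illustration rather than our production tooling: the `DataElement` fields, the sample entries, and the retention periods are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class DataElement:
    """One row of a hypothetical data inventory."""
    name: str
    is_pii: bool
    justification: Optional[str]  # business reason for collecting it
    retention_days: int           # expiration policy


def should_collect(element: DataElement) -> bool:
    """Keep an element only if there is a business justification."""
    return bool(element.justification)


def is_expired(element: DataElement, collected_on: date, today: date) -> bool:
    """True once a record is older than the element's retention period."""
    return today - collected_on > timedelta(days=element.retention_days)


# Illustrative entries only; names and periods are made up.
INVENTORY = [
    DataElement("listening_history", True, "music recommendations", 365),
    DataElement("device_nickname", True, None, 0),
    DataElement("app_version", False, "compatibility checks", 730),
]

kept = [e.name for e in INVENTORY if should_collect(e)]
print(kept)
```

Keeping the justification and the retention period next to each element makes the rule "if we didn’t need some data, we didn’t keep it" something a pipeline can check rather than something people have to remember.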
I consider myself lucky to have been part of this journey at CloudCar as I have gained a wealth of experience. It’s a very exciting time here and each day is spent refining and adding to CloudCar’s machine learning platform and breaking new ground in the connected automobile space.