We offer a proven, integrated, end-to-end Big Data solution, based on years of innovation at Google, that lets you capture, process, store, and analyze your data within a single platform. With Google Cloud Platform you can focus on finding insights rather than managing your infrastructure, and you can combine cloud-native services with open source tools as needed, in both batch and stream mode.
Google BigQuery is Google's fully managed, low-cost analytics data warehouse. BigQuery is serverless: there is no infrastructure to manage, no need to guess the capacity you will need or to overprovision, and no need for a database administrator. You can focus on analyzing data to find meaningful insights, use familiar SQL, and take advantage of our pay-as-you-go model. BigQuery is a powerful Big Data analytics platform used by all types of organizations, from startups to Fortune 500 companies.
Google Cloud Dataflow offers a unified programming model and a managed service for executing a wide range of data processing patterns including streaming analytics, ETL, and batch computation. Cloud Dataflow frees you from operational tasks like capacity planning, resource management and performance optimization.
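To illustrate what a unified programming model means in practice, here is a minimal sketch in plain Python in which the same transform functions serve both a bounded (batch) input and an unbounded (streaming) input. It uses only the standard library and is not the Cloud Dataflow SDK; the names `tokenize`, `count_words`, and `endless_lines` are hypothetical, and the "window" is simply a fixed number of lines.

```python
from collections import Counter
from itertools import islice
from typing import Iterable, Iterator


def tokenize(lines: Iterable[str]) -> Iterator[str]:
    """Transform step: split each line into lowercase words."""
    for line in lines:
        yield from line.lower().split()


def count_words(words: Iterable[str]) -> Counter:
    """Aggregation step: count occurrences of each word."""
    return Counter(words)


# Batch: a bounded, in-memory collection.
batch_input = ["to be or not to be", "that is the question"]
batch_counts = count_words(tokenize(batch_input))


def endless_lines() -> Iterator[str]:
    """Hypothetical unbounded source that never stops producing lines."""
    while True:
        yield "to be or not to be"


# Streaming: an unbounded source has to be cut into finite windows
# before it can be aggregated. Here a window is the next 3 lines.
window = islice(endless_lines(), 3)
window_counts = count_words(tokenize(window))
```

The point of the sketch is that `tokenize` and `count_words` are written once; only the source and the windowing differ between the batch and streaming cases.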
Use Google Cloud Dataproc, a managed Spark and Hadoop service, to easily process big datasets using the powerful and open tools in the Apache big data ecosystem. Control your costs by creating managed clusters of any size in about a minute and turning them off when you're done, so you pay for what you use rather than for idle clusters. Cloud Dataproc integrates with storage, compute, and monitoring services across Cloud Platform products, giving you a powerful and complete data processing platform.
Google Cloud Composer is a fully managed workflow orchestration service that empowers you to author, schedule, and monitor pipelines that span clouds and on-premises data centers. Built on the popular Apache Airflow open source project and operated using the Python programming language, Cloud Composer is free from lock-in and easy to use.
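Apache Airflow describes a pipeline in Python as a directed acyclic graph (DAG) of tasks. The sketch below shows that underlying idea, running tasks in dependency order, using only the standard library `graphlib` module (Python 3.9+). It is not the Airflow or Cloud Composer API; the task names and the `run_task` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "clean": {"extract"},
    "load": {"clean"},
    "report": {"load"},
}

run_log = []


def run_task(name: str) -> None:
    """Stand-in for real work; records the order in which tasks run."""
    run_log.append(name)


# static_order() yields each task only after all of its dependencies.
for task in TopologicalSorter(dependencies).static_order():
    run_task(task)
```

An orchestration service adds scheduling, retries, and monitoring on top of this ordering; the sketch covers only the dependency resolution.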
Google Cloud Datalab is an interactive notebook (based on Jupyter) for exploring, analyzing, and visualizing data collaboratively. It is integrated with BigQuery and Google Cloud Machine Learning to give you easy access to key data processing services.
Google Data Studio turns data into dashboards and reports that are easy to read, share, and customize.
Google Cloud Dataprep is an intelligent data service for visually exploring, cleaning, and preparing structured and unstructured data for analysis. Cloud Dataprep is serverless and works at any scale, so there is no infrastructure to deploy or manage. You prepare data with clicks, not code.
Google Cloud Pub/Sub is a serverless, large-scale, reliable, real-time messaging service that allows you to send and receive messages between independent applications. You can leverage Cloud Pub/Sub’s flexibility to decouple systems and components hosted on Cloud Platform or elsewhere on the Internet. Built on the same technology Google uses, Cloud Pub/Sub is designed to provide “at least once” delivery at low latency with on-demand scaling to tens of millions of messages per second.
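"At least once" delivery means a subscriber may occasionally receive the same message more than once, so message handlers are usually written to be idempotent. The sketch below illustrates that pattern with an in-memory stand-in. It does not use the Cloud Pub/Sub client library; the `Message` and `IdempotentSubscriber` classes and the `handle` method are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    """Hypothetical message: a broker-assigned ID plus a payload."""
    message_id: str
    data: str


@dataclass
class IdempotentSubscriber:
    """Processes each message ID once, even if it is delivered repeatedly."""
    seen_ids: set = field(default_factory=set)
    processed: list = field(default_factory=list)

    def handle(self, message: Message) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        if message.message_id in self.seen_ids:
            return False  # redelivery: safe to acknowledge again, no work done
        self.seen_ids.add(message.message_id)
        self.processed.append(message.data)
        return True


# Simulated delivery stream in which message "1" is redelivered.
deliveries = [
    Message("1", "order created"),
    Message("2", "order paid"),
    Message("1", "order created"),
]

subscriber = IdempotentSubscriber()
results = [subscriber.handle(m) for m in deliveries]
```

In a real deployment the set of seen IDs would need to live in durable storage shared by all subscriber instances; an in-memory set is used here only to keep the sketch self-contained.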
From ingestion to data preparation, storage, and analysis, Cloud Platform provides a suite of serverless services that free you from the need to deploy and operate clusters or to guess the amount of resources needed ahead of time. Combine cloud-native data processing services with the best of open source to easily manage data and benefit from it, today.
Get started using Google Cloud Big Data Products
Start analyzing TBs of data directly from your web browser with BigQuery’s web UI.
Deep dives, technical comparisons, how-tos, and tips and tricks for using the latest data processing technologies.
Check out how to create a cluster and run a simple Spark job in Cloud Dataproc.
Simplify large-scale data processing with the Dataflow programming model.
See some of the talks that our customers gave at Next on how they use Google Cloud data services.
Learn about Apache Beam, the open source version of the Cloud Dataflow programming model, which is portable to other runners such as Apache Spark and Apache Flink.