Apache Spark:
https://spark.apache.org/docs/latest/api/python/getting_started/index.html
Databricks:
https://www.databricks.com
https://www.databricks.com/solutions/industries/retail-industry-solutions
Top Databricks Data Intelligence Platform Alternatives
- SQL Server.
- MongoDB Atlas.
- Oracle Database.
- Teradata VantageCloud.
- Amazon Redshift.
- SAP HANA Cloud.
- Google BigQuery.
- Snowflake Data Cloud.
What is Databricks used for?
Databricks is an industry-leading, cloud-based data engineering tool used for processing and transforming massive quantities of data and exploring the data through machine learning models. Recently added to Azure, it's the latest big data tool for the Microsoft cloud.
Azure
Azure is a cloud computing platform.
The Azure cloud platform comprises more than 200 products and cloud services. It lets you build, run, and manage applications across multiple clouds, on-premises, and at the edge, with the tools and frameworks of your choice.
Azure Data Lake
Azure Data Lake is a scalable data storage and analytics service. The service is hosted in Azure, Microsoft's public cloud. (Source: Wikipedia)
MongoDB Atlas Data Lake
An integrated analytics data store that won't impact the performance of your application.
ETL
ETL code is the set of scripts or programs that perform the data extraction, transformation, and loading tasks. ETL documentation is the collection of information that describes the data sources, the data flow, the data quality, the business rules, and the expected outcomes of the ETL process.
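To make the three ETL stages concrete, here is a minimal sketch in plain Python. It is only an illustration, not how Databricks or any other specific tool implements ETL. The sample data, the `orders` table, the function names, and the rule that a row without an amount is dropped are all assumptions made up for this example.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this would come from a file, API, or database.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,carol,7.25
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: apply a data-quality rule, normalise names, and cast types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # assumed business rule: rows without an amount are dropped
        cleaned.append(
            (int(row["order_id"]), row["customer"].title(), float(row["amount"]))
        )
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Run the three stages in order and return the number of rows loaded."""
    records = transform(extract(text))
    load(records, conn)
    return len(records)


conn = sqlite3.connect(":memory:")  # in-memory database, used only for the demo
loaded = run_etl(RAW_CSV, conn)
```

The docstrings and the comment on the dropped row stand in for ETL documentation here: they record the data source, the data flow, and the business rule next to the code that applies it.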
DevOps