AWS Big Data Blog
Orchestrate an end-to-end ETL pipeline using Amazon S3, AWS Glue, and Amazon Redshift Serverless with Amazon MWAA

Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed orchestration service for Apache Airflow that you can use to set up and operate data pipelines in the cloud at scale. Apache Airflow is an open source tool used to programmatically author, schedule, and monitor sequences of processes and tasks, referred to as workflows. With Amazon...

Thu Apr 25, 2024 20:13
Optimize data layout by bucketing with Amazon Athena and AWS Glue to accelerate downstream queries

In the era of data, organizations are increasingly using data lakes to store and analyze vast amounts of structured and unstructured data. Data lakes provide a centralized repository for data from various sources, enabling organizations to unlock valuable insights and drive data-driven decision-making. However, as data volumes continue to grow, optimizing...

Thu Apr 25, 2024 20:13
Run interactive workloads on Amazon EMR Serverless from Amazon EMR Studio

Starting from release 6.14, Amazon EMR Studio supports interactive analytics on Amazon EMR Serverless. You can now use EMR Serverless applications as the compute, in addition to Amazon EMR on EC2 clusters and Amazon EMR on EKS virtual clusters, to run JupyterLab notebooks from EMR Studio Workspaces. EMR Studio is an integrated development environment...

Wed Apr 24, 2024 20:09
Dynamic DAG generation with YAML and DAG Factory in Amazon MWAA

Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed service that allows you to use a familiar Apache Airflow environment with improved scalability, availability, and security to enhance and scale your business workflows without the operational burden of managing the underlying infrastructure. In Airflow, Directed Acyclic Graphs (DAGs)...

Mon Apr 22, 2024 21:07
How Salesforce optimized their detection and response platform using AWS managed services

This is a guest blog post co-authored with Atul Khare and Bhupender Panwar from Salesforce. Headquartered in San Francisco, Salesforce, Inc. is a cloud-based customer relationship management (CRM) software company building artificial intelligence (AI)-powered business applications that allow businesses to connect with their customers in new and personalized...

Thu Apr 18, 2024 21:31
Amazon OpenSearch Service Under the Hood: OpenSearch Optimized Instances (OR1)

Amazon OpenSearch Service recently introduced the OpenSearch Optimized Instance family (OR1), which delivers up to 30% price-performance improvement over existing memory optimized instances in internal benchmarks, and uses Amazon Simple Storage Service (Amazon S3) to provide 11 nines of durability. With this new instance family, OpenSearch Service uses...

Wed Apr 17, 2024 18:26
