A company regularly receives new training data from a vendor of an ML model. The vendor delivers cleaned and prepared data to the company's Amazon S3 bucket every 3-4 days. The company has an Amazon SageMaker AI pipeline to retrain the model. An ML engineer needs to run the pipeline automatically when new data is uploaded to the S3 bucket. Which solution will meet these requirements with the LEAST operational effort?
An ML engineer wants to run a training job on Amazon SageMaker AI. The training job will train a neural network by using multiple GPUs. The training dataset is stored in Parquet format. The ML engineer discovered that the Parquet dataset contains files too large to fit into the memory of the SageMaker AI training instances. Which solution will fix the memory problem?
A company runs an ML model on Amazon SageMaker AI. The company uses an automatic process that makes API calls to create training jobs for the model. The company has new compliance rules that prohibit the collection of aggregated metadata from training jobs. Which solution will prevent SageMaker AI from collecting metadata from the training jobs?
A company wants to build an anomaly detection ML model. The model will use large-scale tabular data that is stored in an Amazon S3 bucket. The company does not have expertise in Python, Spark, or other languages for ML. An ML engineer needs to transform and prepare the data for ML model training. Which solution will meet these requirements?
A government agency is conducting a national census to assess program needs by area and city. The census form collects approximately 500 responses from each citizen. The agency needs to analyze the data to extract meaningful insights. The agency wants to reduce the dimensions of the high-dimensional data to uncover hidden patterns. Which solution will meet these requirements?