Wednesday, June 26, 2019

Azure Batch service

Azure Batch runs large-scale parallel and high-performance computing (HPC) jobs efficiently in Azure. This post shows how to create a Batch account, a pool of compute nodes (virtual machines), and a job that runs basic tasks on the pool. There is no cluster or job scheduler software to install, manage, or scale. Instead, you use Batch APIs and tools, command-line scripts, or the Azure portal to configure, manage, and monitor your jobs.


Batch works well with intrinsically parallel (also known as "embarrassingly parallel") workloads. Intrinsically parallel workloads are those where the applications can run independently, and each instance completes part of the work. When the applications are executing, they might access some common data, but they do not communicate with other instances of the application. Intrinsically parallel workloads can therefore run at a large scale, determined by the amount of compute resources available to run applications simultaneously.

Some examples of intrinsically parallel workloads you can bring to Batch:
  • Financial risk modeling using Monte Carlo simulations
  • VFX and 3D image rendering
  • Image analysis and processing
  • Media transcoding
  • Genetic sequence analysis
  • Optical character recognition (OCR)
  • Data ingestion, processing, and ETL operations
  • Software test execution
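
To make the idea concrete, here is a minimal local sketch of an intrinsically parallel workload: a Monte Carlo estimate of pi split into chunks that never talk to each other. It does not use Azure Batch at all; the function names, chunk count, and sample sizes are my own illustrative choices. A thread pool stands in for the compute nodes only to show the structure, since on Batch each chunk would be a separate task.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(seed: int, samples: int) -> int:
    """One independent unit of work: count random points inside the unit circle."""
    rng = random.Random(seed)  # private generator, so chunks share no state
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(chunks: int = 4, samples_per_chunk: int = 50_000) -> float:
    """Run the chunks side by side and combine their results only at the end."""
    seeds = range(chunks)
    sizes = [samples_per_chunk] * chunks
    with ThreadPoolExecutor(max_workers=chunks) as pool:
        hits = list(pool.map(count_hits, seeds, sizes))
    return 4.0 * sum(hits) / (chunks * samples_per_chunk)


if __name__ == "__main__":
    print(estimate_pi())
```

Because the chunks only meet when their counts are summed, adding more chunks (or more nodes) scales the work without any coordination between them.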



Steps to create an Azure Batch job

There are many ways to create Azure Batch jobs: PowerShell, the Azure portal, or SDKs for programming languages such as Python and .NET. Here I provide the steps using the Azure portal.

1. First log in to the Azure portal and create a Batch account.

Select Create a resource > Compute > Batch Service.

2. Now create the Batch account. Provide the subscription details (here I am using a free account) and create a resource group as well (the resource group name must be unique within the subscription).

3. Now click on the resource group  and create a pool of compute nodes 

4. Click on Pools and configure the pool as below. This configuration creates the VMs for the Batch job. I have selected CentOS 7.6 as the image. You can also choose between a fixed number of nodes and an autoscaled pool, depending on the requirement. I have created a pool called batch1.
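
The same pool can also be described from Python. The sketch below is only an outline under assumptions: the image values (publisher OpenLogic, offer CentOS, sku 7.6) and the node agent SKU string are my guesses for the CentOS 7.6 choice above and should be checked against the list the portal offers, and the helper names describe_pool and add_pool are hypothetical. The second function uses the model classes of the azure-batch package, which is imported lazily so that the first function works without it.

```python
from typing import Any, Dict


def describe_pool(pool_id: str, vm_size: str,
                  dedicated_nodes: int = 0,
                  low_priority_nodes: int = 0) -> Dict[str, Any]:
    """Collect the settings of a fixed-size pool in a plain dictionary."""
    if dedicated_nodes < 0 or low_priority_nodes < 0:
        raise ValueError("node counts cannot be negative")
    if dedicated_nodes + low_priority_nodes == 0:
        raise ValueError("a fixed-size pool needs at least one node")
    return {
        "id": pool_id,
        "vm_size": vm_size,
        "target_dedicated_nodes": dedicated_nodes,
        "target_low_priority_nodes": low_priority_nodes,
        # Assumed Marketplace image for CentOS 7.6; verify before use.
        "image": {"publisher": "OpenLogic", "offer": "CentOS",
                  "sku": "7.6", "version": "latest"},
        # Assumed node agent SKU matching that image.
        "node_agent_sku_id": "batch.node.centos 7",
    }


def add_pool(batch_client, spec: Dict[str, Any]) -> None:
    """Send the pool description to the Batch service."""
    import azure.batch.models as batchmodels  # lazy: needs the azure-batch package

    vm_config = batchmodels.VirtualMachineConfiguration(
        image_reference=batchmodels.ImageReference(**spec["image"]),
        node_agent_sku_id=spec["node_agent_sku_id"],
    )
    batch_client.pool.add(batchmodels.PoolAddParameter(
        id=spec["id"],
        vm_size=spec["vm_size"],
        virtual_machine_configuration=vm_config,
        target_dedicated_nodes=spec["target_dedicated_nodes"],
        target_low_priority_nodes=spec["target_low_priority_nodes"],
    ))
```

For example, describe_pool("batch1", "STANDARD_A1_v2", dedicated_nodes=1) describes a one-node version of the pool created above. An autoscaled pool is not covered by this sketch.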

5. Now we need to configure a storage account for the Batch account. This is needed when the job has to publish its output to Blob storage or read its input from a storage resource. It may take some time to create the storage resource.
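
To show why the storage account matters, here is a small sketch of how input files could be pushed to a blob container before a job runs. The helper name upload_inputs is hypothetical. It assumes the object passed in behaves like BlockBlobService from the azure-storage-blob 1.x package (the version pinned later in requirements.txt), and it only calls create_container and create_blob_from_path on it.

```python
import os
from typing import Iterable, List


def upload_inputs(blob_service, container: str, paths: Iterable[str]) -> List[str]:
    """Upload local files to a blob container and return the blob names.

    blob_service is assumed to offer the create_container and
    create_blob_from_path methods of BlockBlobService (azure-storage-blob 1.x).
    """
    # Do not fail if the container is already there.
    blob_service.create_container(container, fail_on_exist=False)
    blob_names = []
    for path in paths:
        blob_name = os.path.basename(path)  # blob is named after the file
        blob_service.create_blob_from_path(container, blob_name, path)
        blob_names.append(blob_name)
    return blob_names


# Possible usage, assuming azure-storage-blob 1.x is installed:
#   from azure.storage.blob import BlockBlobService
#   service = BlockBlobService(account_name="<name>", account_key="<key>")
#   upload_inputs(service, "input", ["taskdata0.txt", "taskdata1.txt"])
```

Passing the client in as a parameter keeps the helper easy to try out with a stand-in object before pointing it at a real storage account.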

6. After successfully creating the storage account, I linked it to the Batch account as below.

7. Now we have to create a job from the Batch account as below.

8. Let's execute the Batch job using the Python SDK. Download the Python sample from GitHub (this is a simple program to quick-start a Batch job).

unixchips@unixchips:~$ git clone https://github.com/Azure-Samples/batch-python-quickstart.git
Cloning into 'batch-python-quickstart'...
remote: Enumerating objects: 103, done.
remote: Total 103 (delta 0), reused 0 (delta 0), pack-reused 103
Receiving objects: 100% (103/103), 33.56 MiB | 1.79 MiB/s, done.
Resolving deltas: 100% (45/45), done.
Checking connectivity... done.


9. Once you clone the app, you can see several files inside the directory. Let's get the environment ready by installing the packages listed in requirements.txt.

unixchips@unixchips:~/batch-python-quickstart/src$ ls -lrt
total 28
drwxrwxr-x 2 unixchips unixchips  4096 Jun 26 20:42 InputFiles
-rw-rw-r-- 1 unixchips unixchips    44 Jun 26 20:42 requirements.txt
-rw-rw-r-- 1 unixchips unixchips  1385 Jun 26 20:42 config.py
-rw-rw-r-- 1 unixchips unixchips 14923 Jun 26 20:42 batch_python_tutorial_ffmpeg.py
unixchips@unixchips:~/batch-python-ffmpeg-tutorial/src$




unixchips@unixchips:~/batch-python-ffmpeg-tutorial/src$ pip install -r requirements.txt
Collecting azure-batch==6.0.0 (from -r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/f0/17/a9e17e280769cc1da2b3fc4db8b8074adf4e1676c1b7487bf3fbef51142f/azure_batch-6.0.0-py2.py3-none-any.whl (214kB)
    100% |████████████████████████████████| 215kB 721kB/s
Collecting azure-storage-blob==1.4.0 (from -r requirements.txt (line 2))
  Downloading https://files.pythonhosted.org/packages/f7/b7/9b20c39bf411e896d110d01f2551e6e7b397fde6eb06b07293fe29705d13/azure_storage_blob-1.4.0-py2.py3-none-any.whl (75kB)
    100% |████████████████████████████████| 81kB 1.1MB/s
Collecting msrest>=0.5.0 (from azure-batch==6.0.0->-r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/56/1d/a5947aba03169aeccd2d6539a3986614d133be53adac028ec3d6a1a285a4/msrest-0.6.8-py2.py3-none-any.whl (82kB)
    100% |████████████████████████████████| 92kB 1.3MB/s
Collecting azure-nspkg; python_version < "3.0" (from azure-batch==6.0.0->-r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/c2/95/af354f2f415d250dafe26a5d94230558aa8cf733a9dcbf0d26cd61f5a9b8/azure_nspkg-3.0.2-py2-none-any.whl
Collecting azure-common~=1.1 (from azure-batch==6.0.0->-r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/00/55/a703923c12cd3172d5c007beda0c1a34342a17a6a72779f8a7c269af0cd6/azure_common-1.1.23-py2.py3-none-any.whl
Collecting msrestazure<2.0.0,>=0.4.32 (from azure-batch==6.0.0->-r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/0a/aa/b17a4f702ecd6d9e989ae34109aa384c988aed0de37215c651165ed45238/msrestazure-0.6.1-py2.py3-none-any.whl (40kB)
    100% |████████████████████████████████| 40kB 2.3MB/s
Collecting futures; python_version < "3.0" (from azure-storage-blob==1.4.0->-r requirements.txt (line 2))
  Using cached https://files.pythonhosted.org/packages/2d/99/b2c4e9d5a30f6471e410a146232b4118e697fa3ffc06d6a65efde84debd0/futures-3.2.0-py2-none-any.whl
Collecting azure-storage-common~=1.4 (from azure-storage-blob==1.4.0->-r requirements.txt (line 2))
  Downloading https://files.pythonhosted.org/packages/05/6c/b2285bf3687768dbf61b6bc085b0c1be2893b6e2757a9d023263764177f3/azure_storage_common-1.4.2-py2.py3-none-any.whl (47kB)


10. Check the dependencies in requirements.txt, then configure the Batch and storage account details in config.py as below.

unixchips@unixchips:~/batch-python-ffmpeg-tutorial/src$ cat requirements.txt
azure-batch==6.0.0
azure-storage-blob==1.4.0
unixchips@unixchips:~/batch-python-ffmpeg-tutorial/src$


_BATCH_ACCOUNT_NAME ='unixchipsazure'
_BATCH_ACCOUNT_KEY = '************************69XOrlnGgPnx07PAqL1mqlM43xgz2CPmrUlB0Y7mJy256pEw=='
_BATCH_ACCOUNT_URL = 'https://unixchipsazure.westindia.batch.azure.com'
_STORAGE_ACCOUNT_NAME = '******storage1'
_STORAGE_ACCOUNT_KEY = '**************************iKSOnV0gBGE69XOrlnGgPnx07PAqL1mqlM43xgz2CPmrUlB0Y7mJy256pEw=='
_POOL_ID = 'LinuxFfmpegPool'
_DEDICATED_POOL_NODE_COUNT = 0
_LOW_PRIORITY_POOL_NODE_COUNT = 5
_POOL_VM_SIZE = 'STANDARD_A1_v2'
_JOB_ID = 'LinuxFfmpegJob'
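
Since config.py holds account keys, one option is to keep them out of the file and read them from environment variables instead. The sketch below is one way to do that, not what the sample itself does: the environment variable names and the helper names are hypothetical, and the account URL pattern is simply the one visible in _BATCH_ACCOUNT_URL above.

```python
import os
from typing import Dict, Mapping

# Hypothetical environment variable names, chosen for this sketch.
REQUIRED = (
    "BATCH_ACCOUNT_NAME",
    "BATCH_ACCOUNT_KEY",
    "BATCH_ACCOUNT_REGION",
    "STORAGE_ACCOUNT_NAME",
    "STORAGE_ACCOUNT_KEY",
)


def batch_account_url(account_name: str, region: str) -> str:
    """Build the account URL in the same shape as _BATCH_ACCOUNT_URL."""
    return "https://{}.{}.batch.azure.com".format(account_name, region)


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read the required settings and fail early if any is missing or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    settings["BATCH_ACCOUNT_URL"] = batch_account_url(
        settings["BATCH_ACCOUNT_NAME"], settings["BATCH_ACCOUNT_REGION"])
    return settings
```

With the account name unixchipsazure and the region westindia, batch_account_url returns the same URL as in the config shown above.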


11. Execute the Python script as below. It uploads the input files, creates the pool, job, and tasks, and prints the task output.

unixchips@unixchips:~/batch-python-quickstart/src$ python python_quickstart_client.py
Sample start: 2019-06-27 13:36:28

Uploading file /home/unixchips/batch-python-quickstart/src/taskdata0.txt to container [input]...
Uploading file /home/unixchips/batch-python-quickstart/src/taskdata1.txt to container [input]...
Uploading file /home/unixchips/batch-python-quickstart/src/taskdata2.txt to container [input]...
Creating pool [PythonQuickstartPool]...
Creating job [PythonQuickstartJob]...
Adding 3 tasks to job [PythonQuickstartJob]...
Monitoring all tasks for 'Completed' state, timeout in 0:30:00....................................................................................................................................
  Success! All tasks reached the 'Completed' state within the specified timeout period.
Printing task output...
Task: Task0
Node: tvmps_1ce23525576288d68eb056920c5e0f879e8c995ead4e5c5ebd400566ae41d15c_d
Standard output:
With support for Linux, Windows Server, SQL Server, Oracle, IBM, and SAP, Azure Virtual Machines gives you the flexibility of virtualization for a wide range of computing solutions—development and testing, running applications, and extending your datacenter. It’s the freedom of open-source software configured the way you need it. It’s as if it was another rack in your datacenter, giving you the power to deploy an application in minutes instead of weeks.

It’s all about choice for your virtual machines. Choose Linux or Windows. Choose to be on-premises, in the cloud, or both. Choose your own virtual machine image or download a certified pre-configured image in our marketplace. With Virtual Machines, you’re in control.

Combine the performance of a world-class supercomputer with the scalability of the cloud. Scale from one to thousands of virtual machine instances. Plus, with the growing number of regional Azure datacenters, easily scale globally so you’re closer to where your customers are.

Keep your budget in check with low-cost, per-minute billing. You only pay for the compute time you use.

We’ll help you encrypt sensitive data, protect virtual machines from viruses and malware, secure network traffic, and meet regulatory and compliance requirements.
Task: Task1
Node: tvmps_4c74455e36026e69069e218c090a165119d9588ef180038ca8cef7cd76bfd709_d
Standard output:
Batch processing began with mainframe computers and punch cards. Today it still plays a central role in business, engineering, science, and other pursuits that require running lots of automated tasks—processing bills and payroll, calculating portfolio risk, designing new products, rendering animated films, testing software, searching for energy, predicting the weather, and finding new cures for disease. Previously only a few had access to the computing power for these scenarios. With Azure Batch, that power is available to you when you need it, without any capital investment.

Choose the operating system and development tools you need to run your large-scale jobs on Batch. Batch provides a consistent job scheduling and management experience whether you select Windows Server or Linux compute nodes, but lets you take advantage of the unique features of each environment. With Windows, use your existing Windows-based code, including .NET, to run large-scale compute jobs in Azure. With Linux, choose from popular distributions including CentOS, Ubuntu, and SUSE Linux Enterprise Server to run your compute jobs, or use Docker containers to lift and shift your applications. Batch provides SDKs and supports a range of development tools including Python and Java.

Batch runs the applications that you use on workstations and clusters today. It’s easy to cloud-enable your executables and scripts to scale out. Batch provides a queue to receive the work that you want to run and executes your applications. Describe the data that need to be moved to the cloud for processing, how the data should be distributed, what parameters to use for each task, and the command to start the process. Think about this like an assembly line with multiple applications. Batch makes it easy to share data between steps and manage the execution as a whole.


******************<output is omitted>*******************
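
The "Monitoring all tasks for 'Completed' state, timeout in 0:30:00" line in the output is a polling loop with a deadline. Here is a rough sketch of that pattern, not the sample's actual code: wait_for_completion and its parameters are hypothetical names, and list_states stands for whatever call returns the current task states as lowercase strings such as "running" or "completed".

```python
import time
from datetime import timedelta
from typing import Callable, Iterable


def wait_for_completion(list_states: Callable[[], Iterable[str]],
                        timeout: timedelta = timedelta(minutes=30),
                        poll_seconds: float = 1.0) -> bool:
    """Poll until every task is completed; return False if the timeout passes."""
    deadline = time.monotonic() + timeout.total_seconds()
    while True:
        states = list(list_states())
        # An empty list means no tasks were found yet, so keep waiting.
        if states and all(state == "completed" for state in states):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
```

Each pass through the loop corresponds to one dot in the output line above. The function returns True as soon as all tasks report completed and False once the deadline has passed.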

12. Click on the Batch account and then Pools; we can see the heat map as below.

13. The output is displayed on the screen with the details of the nodes created for the job and the contents of the sample files.

taskdata0.txt
taskdata1.txt
taskdata2.txt

output

Success! All tasks reached the 'Completed' state within the specified timeout period.
Printing task output...
Task: Task0
Node: tvmps_1ce23525576288d68eb056920c5e0f879e8c995ead4e5c5ebd400566ae41d15c_d
Standard output:
With support for Linux, Windows Server, SQL Server, Oracle, IBM, and SAP, Azure Virtual Machines gives you the flexibility of virtualization for a wide range of computing solutions—development and testing, running applications, and extending your datacenter. It’s the freedom of open-source software configured the way you need it. It’s as if it was another rack in your datacenter, giving you the power to deploy an application in minutes instead of weeks.

It’s all about choice for your virtual machines. Choose Linux or Windows. Choose to be on-premises, in the cloud, or both. Choose your own virtual machine image or download a certified pre-configured image in our marketplace. With Virtual Machines, you’re in control.

Combine the performance of a world-class supercomputer with the scalability of the cloud. Scale from one to thousands of virtual machine instances. Plus, with the growing number of regional Azure datacenters, easily scale globally so you’re closer to where your customers are.

Keep your budget in check with low-cost, per-minute billing. You only pay for the compute time you use.

We’ll help you encrypt sensitive data, protect virtual machines from viruses and malware, secure network traffic, and meet regulatory and compliance requirements.
Task: Task1
Node: tvmps_4c74455e36026e69069e218c090a165119d9588ef180038ca8cef7cd76bfd709_d
Standard output:
Batch processing began with mainframe computers and punch cards. Today it still plays a central role in business, engineering, science, and other pursuits that require running lots of automated tasks—processing bills and payroll, calculating portfolio risk, designing new products, rendering animated films, testing software, searching for energy, predicting the weather, and finding new cures for disease. Previously only a few had access to the computing power for these scenarios. With Azure Batch, that power is available to you when you need it, without any capital investment.

Choose the operating system and development tools you need to run your large-scale jobs on Batch. Batch provides a consistent job scheduling and management experience whether you select Windows Server or Linux compute nodes, but lets you take advantage of the unique features of each environment. With Windows, use your existing Windows-based code, including .NET, to run large-scale compute jobs in Azure. With Linux, choose from popular distributions including CentOS, Ubuntu, and SUSE Linux Enterprise Server to run your compute jobs, or use Docker containers to lift and shift your applications. Batch provides SDKs and supports a range of development tools including Python and Java.

Batch runs the applications that you use on workstations and clusters today. It’s easy to cloud-enable your executables and scripts to scale out. Batch provides a queue to receive the work that you want to run and executes your applications. Describe the data that need to be moved to the cloud for processing, how the data should be distributed, what parameters to use for each task, and the command to start the process. Think about this like an assembly line with multiple applications. Batch makes it easy to share data between steps and manage the execution as a whole.

You use a workstation today, maybe a small cluster, or you wait in a queue to run your jobs. What if you had access to 16 cores, 100 cores, 10,000 cores, or even 100,000 cores when you needed them, and only had to pay for what you used? With Batch you can. Avoid the bottlenecks and waiting that limit your imagination. What could you do on Azure that you can’t do today?
Task: Task2
Node: tvmps_4c74455e36026e69069e218c090a165119d9588ef180038ca8cef7cd76bfd709_d
Standard output:
Azure Storage offers a set of storage services for all your business needs. Choose from Blob Storage (Object Storage) for unstructured data, File Storage for SMB-based cloud file shares, Table Storage for NoSQL data, Queue Storage to reliably store messages, and Premium Storage for high-performance, low-latency block storage for I/O-intensive workloads running in Azure Virtual Machines.

Storage keeps pace with your growing data needs, delivering petabytes of storage for the largest scenarios. Whether you're building modern applications or a high-scale big data application, Storage can handle it.

Storage is available in more regions than any other public cloud offering, letting you store your data where it makes the most business sense. Scale up or across data centers as needed, and be closer to your customers for faster access and better performance.

Storage automatically replicates your data and maintains multiple copies—either in a single region or globally with geo-redundancy—to help guard against unexpected hardware failures.
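
The output above has one task per input file (Task0 for taskdata0.txt, and so on), and each task prints the contents of its file. A small sketch of how such task definitions could be generated is below. It assumes each task just runs cat on its file through bash, which matches what the output shows but is my assumption about the command; build_tasks is a hypothetical helper name.

```python
import shlex
from typing import List, Tuple


def build_tasks(input_files: List[str]) -> List[Tuple[str, str]]:
    """Return (task_id, command_line) pairs, one per input file."""
    tasks = []
    for index, file_name in enumerate(input_files):
        task_id = "Task{}".format(index)
        # Quote the file name and the inner command so odd names stay intact.
        inner = "cat " + shlex.quote(file_name)
        command = "/bin/bash -c " + shlex.quote(inner)
        tasks.append((task_id, command))
    return tasks


if __name__ == "__main__":
    for task_id, command in build_tasks(
            ["taskdata0.txt", "taskdata1.txt", "taskdata2.txt"]):
        print(task_id, command)
```

Since every task only reads its own file, the three tasks can land on any node in any order, which is why Task1 and Task2 above ran on the same node without any issue.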


We have successfully executed a job with the Batch service in Azure.

Thank you for reading & sharing 
