The Difference Between Real-Time, Near Real-Time, and Batch Processing in Big Data

When it comes to data processing, there are more ways to do it than ever. Your choices include real-time, near real-time, and batch processing. How you do it and the tools you choose depend largely on what your purposes are for processing the data in the first place.

In many cases, you’re processing historical and archived data and time isn’t so critical. You can wait a few hours for your answer, and if necessary, a few days. Conversely, other processing tasks are crucial, and the answers need to be delivered within seconds to be of value.

Real-time, near real-time, and batch processing

Type of data processing: When do you need it?
Real-time: When you need information processed immediately (such as at a bank ATM)
Near real-time: When speed is important, but you don’t need it immediately (such as when producing operational intelligence)
Batch: When you can wait for days (or longer) for processing (payroll is a good example)

What is real-time processing and when do you need it?

Real-time processing requires a continual input, constant processing, and steady output of data.

Great examples of real-time processing are data streaming, radar systems, customer service systems, and bank ATMs, where immediate processing is crucial to make the system work properly. Apache Spark, with its streaming support, is a popular tool for this kind of processing.

Examples of real-time processing:

Data streaming
Radar systems
Customer service systems
Bank ATMs
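
As a rough illustration of the "continual input, constant processing, steady output" idea, here is a minimal Python sketch in which each event is handled the moment it arrives. The event source, the field names, and the approval rule are all hypothetical and only stand in for a real ATM back end.

```python
from typing import Iterable, Iterator


def atm_events() -> Iterator[dict]:
    """Hypothetical source: yields withdrawal requests one at a time."""
    yield {"account": "A", "balance": 100, "amount": 40}
    yield {"account": "B", "balance": 20, "amount": 50}
    yield {"account": "C", "balance": 75, "amount": 75}


def authorize(events: Iterable[dict]) -> Iterator[tuple]:
    """Emit a decision for each event as soon as it is read."""
    for event in events:
        # Made-up rule: approve when the balance covers the amount.
        approved = event["amount"] <= event["balance"]
        yield event["account"], "approved" if approved else "declined"


# Each decision is produced before the next event is even requested.
decisions = list(authorize(atm_events()))
```

Because both functions are generators, nothing is buffered: one event goes in and one decision comes out, which is the property that separates this style from batch processing.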

What is near real-time processing and when do you need it?

Near real-time processing is appropriate when speed is important, but a processing time of minutes rather than seconds is acceptable.

An example of this processing is the production of operational intelligence, which is a combination of data processing and Complex Event Processing (CEP). CEP involves combining data from multiple sources in order to detect patterns. It’s useful for identifying opportunities in the data sets (such as sales leads) as well as threats (such as an intruder in the network).

Operational intelligence, or OI, should not be confused with operational business intelligence, or OBI, which involves the analysis of historical and archived data for strategic and planning purposes. OBI does not need to be processed in real time or near real-time.

Examples of near real-time processing:

Processing sensor data
IT systems monitoring
Financial transaction processing
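
The pattern detection described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two event sources (failed logins and firewall alerts), the 60-second window, and the threshold of three failures are all made up, and a real CEP engine would do far more.

```python
from collections import defaultdict


def detect_intruders(login_failures, firewall_alerts,
                     window_seconds=60, threshold=3):
    """Flag IPs that have a firewall alert with at least `threshold`
    failed logins within `window_seconds` of that alert."""
    # Source 1: group failed-login timestamps by IP address.
    failures_by_ip = defaultdict(list)
    for timestamp, ip in login_failures:
        failures_by_ip[ip].append(timestamp)

    # Source 2: correlate each firewall alert with nearby failures.
    suspects = set()
    for alert_time, ip in firewall_alerts:
        nearby = [t for t in failures_by_ip.get(ip, [])
                  if abs(t - alert_time) <= window_seconds]
        if len(nearby) >= threshold:
            suspects.add(ip)
    return sorted(suspects)


# Hypothetical events collected over the last few minutes: (seconds, ip).
login_failures = [(0, "10.0.0.5"), (10, "10.0.0.5"),
                  (20, "10.0.0.5"), (15, "10.0.0.9")]
firewall_alerts = [(30, "10.0.0.5"), (40, "10.0.0.9")]

suspects = detect_intruders(login_failures, firewall_alerts)
```

Running a check like this every minute or so over the most recent events is what makes it near real-time: the answer arrives quickly, but not within the same instant as the event.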

What is batch processing and when do you need it?

Batch processing is even less time-sensitive than near real-time. In fact, batch processing jobs can take hours, or perhaps even days.

Batch processing involves three separate processes. First, data is collected, usually over a period of time. Second, the data is processed by a separate program. Third, the data is output. Examples of data entered for analysis include operational data, historical and archived data, data from social media, service data, etc.

MapReduce is a useful tool for batch processing and for analytics that don’t need to be real-time or near real-time, because it is designed to process very large data sets in parallel.

Examples of uses for batch processing include payroll and billing activities, which usually occur on monthly cycles, and deep analytics that are not needed for immediate decision making.

Examples of batch processing:

Payroll
Billing
Orders from customers
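
The three batch steps (collect, process, output) can be sketched with a small payroll example in Python. The timesheet rows, names, and rates below are invented for illustration, and the report is written to an in-memory buffer rather than a real file.

```python
import csv
import io
from collections import defaultdict

# Step 1: collect. Rows accumulated over the pay period (made-up data):
# (employee, hours worked, hourly rate).
collected = [
    ("alice", 8.0, 30.0),
    ("bob", 6.5, 20.0),
    ("alice", 7.5, 30.0),
]


def run_payroll(rows):
    """Step 2: process the whole batch in one pass."""
    totals = defaultdict(float)
    for employee, hours, rate in rows:
        totals[employee] += hours * rate
    return dict(totals)


def write_report(totals):
    """Step 3: output the results as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["employee", "gross_pay"])
    for employee in sorted(totals):
        writer.writerow([employee, f"{totals[employee]:.2f}"])
    return buffer.getvalue()


totals = run_payroll(collected)
report = write_report(totals)
```

Nothing is calculated until the whole period's data is available, which is why a job like this can be scheduled overnight or at month end without anyone waiting on it.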

Precisely can help keep your data fresh. Precisely Connect continually keeps data in your analytics platforms in sync with changes made on the mainframe, so the most current information is available in the data lake for analytics.

To learn more, read our whitepaper: Easy, Automated, Real-Time Data Sharing: What Can It Do For Your Business? 

The post The Difference Between Real-Time, Near Real-Time, and Batch Processing in Big Data appeared first on Precisely.

GitHub to IBMi: An incredible journey of RPG source

This is another blog post on the journey of IBMi modernization for code management. We will cover hosting source on GitHub, moving the source from GitHub to IBMi, and compiling all the sources.
As most of us know, source management on IBMi natively works in a centralized way: a source member opened by one user in edit mode cannot be edited by another user. That makes sense, because we don’t want people changing the same source simultaneously. With a distributed version control (DVC) system, however, more than one user can work on the same source, either by cloning it to a local machine or by creating a separate branch and working on it. A DVC system can help with better source management and collaboration between teams. We cover this in more detail below, so please read on and try implementing it yourself.

About GitHub
About Git
GitHub features
Creating a new project on GitHub
Cloning to the IFS and running a makefile, or using a custom program to clone the repo and run the makefile (time-consuming)
Making changes on the IFS and pushing them to Git
Pull requests, issues, branches, tags, and releases

GitHub

The definition of GitHub, as taken from Wikipedia:

GitHub, Inc., is an Internet hosting service for software development and version control using Git. It provides the distributed version control of Git plus access control, bug tracking, software feature requests, task management, continuous integration, and wikis for every project.

Git

Git is a free and open-source distributed version control system designed to handle everything from small to huge projects with speed and efficiency. The Git feature that really makes it stand apart from nearly every other SCM out there is its branching model. More can be read here about Git on i. Please note that it is not mandatory to use GitHub with Git. We can do Git version control locally as well, but GitHub is convenient and makes Git easier to use.

Using GitHub

There are a lot of articles, videos, and tutorials about using GitHub, but none of them is IBMi specific, which is what we will cover here. Using IBMi conventions will make it easier for readers to understand the GitHub features and how they can help in native source management. To keep things simple, we will start a new project as a reference instead of using existing IBMi sources or projects, which we will cover in future articles.

Sign in to or sign up for GitHub, which is as easy as creating any social media account.

Some GitHub terms to become familiar with:

Repository – The literal meaning is a place or container in which something is stored in large quantities. For our purpose, it is a folder in GitHub where all of our project’s files and their change history are stored.
Branch – When a new repository is created, it gets a default branch, usually named main in GitHub. By default, all changes you make in the repository are on the main branch. Using GitHub, one can create multiple branches easily. When a new branch is created in the repository, it is typically based on the default branch unless specified otherwise. That means the new branch starts out as a copy of the main branch within the same repository. Branching is a way of isolating changes from the main branch and merging them into the main branch when all the changes are ready. The image below, copied from the GitHub docs, gives an idea of why branching is important. By convention, the main or default branch should be used as the base/production branch, which means all the code running in production must come from the main/default branch only.

Pull Request – When a developer is done with code changes and testing, the changes must go to the main/default branch to be deployed to production later. Before that, a code review is always done. The developer requests it by making a pull request from the feature branch to the main branch. This simply means the developer is asking another person to review the changes and, if everything looks good, to pull them from the feature branch into the main branch, hence the name pull request. GitHub very neatly shows all the code changes made by the developer on the feature branch. The reviewer can comment on the changes, or approve them if all looks good, and then merge them into the main branch.
Clone – A clone of a repository is a copy of the complete repository on another machine. Private repositories usually need some setup before they can be cloned, for security reasons. Also, Git must be installed on the machine where the repository is to be cloned.
README.md – A file that usually resides at the root of the project and is used to record information about the project stored in the repository. It helps any new developer who will be using the repository. The file is written in a special format called Markdown.

There is a lot more that could be discussed, but since this blog only covers what is needed to get started, this is enough. For detailed reading, one can always refer to the official GitHub docs.

Create a new repository with a name of your choice.

Add a new file to the repository

Congratulations! You have your first file on GitHub.

Set up Git on IBMi

This can be done by following the steps mentioned in my previous blog: how-to-set-up-git-on-your-machine

Getting source from GitHub

Now that your repository exists and Git is set up on IBMi, the repository can be brought from GitHub to the IBMi by cloning it, i.e. fetching the data from GitHub to IBMi using the git clone repo-link command, e.g. git clone https://github.com/IBM/xmlservice.git

Since we have created a public repository, we don’t need to provide any authentication for this step. For a private repository, we need to generate an SSH key for authentication. The public key generated on the IBMi must also be copied to our GitHub account. Use this very simple guide from GitHub: ssh-keygen.

Since we do these actions in the IBMi PASE terminal, we will be in an IFS folder, and the repository will be cloned into the path where we execute the clone command. If you want to store it in a different path, change your path before executing the git clone command, e.g. cd /new/path/github/mylibrary.

Once the repository is cloned, you will see a folder with the repository’s name created in your path. If you go into that folder, you will find all your files.
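
If you would rather script the clone than type it, the steps above can be sketched in Python. This assumes Python 3 and Git are both available in PASE, which the steps above do not require; the function names are hypothetical, and only the git clone command and the example repository URL come from the text above.

```python
import subprocess
from pathlib import PurePosixPath
from urllib.parse import urlparse


def clone_folder_name(repo_url: str) -> str:
    """Return the folder name git creates by default for a repository URL."""
    name = PurePosixPath(urlparse(repo_url).path).name
    return name[:-4] if name.endswith(".git") else name


def build_clone_command(repo_url: str) -> list:
    """Build the same command you would type in the terminal."""
    return ["git", "clone", repo_url]


def clone_into(repo_url: str, parent_dir: str) -> str:
    """Run git clone inside parent_dir (like cd first, then clone)
    and return the path of the folder that git created."""
    subprocess.run(build_clone_command(repo_url), cwd=parent_dir, check=True)
    return str(PurePosixPath(parent_dir) / clone_folder_name(repo_url))


# Example (not executed here; needs network access and an existing folder):
# clone_into("https://github.com/IBM/xmlservice.git",
#            "/new/path/github/mylibrary")
```

Passing cwd to subprocess.run plays the same role as the cd command: the repository folder is created under whatever path the clone runs in.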

Now a big question might be how to use these sources to build objects on the i. The simple answer is a makefile. More about makefiles and their use in the next blog. In the meantime, try compiling each source from the terminal using the system command: system "CRTBNDPGM ……….FROMIFS(/path/to/ifs)".

Scott’s Corner – %SPLIT

Tip of the Month

I’m pleasantly surprised about how easy it has become to work with delimited strings in RPG with the recent enhancements from IBM. For splitting a delimited string into fields, they added the %SPLIT built-in function in April of 2021. It works like this:

In this example, record is a variable containing data from a delimited file (in this example, the data is delimited by the pipe character). The %SPLIT built-in breaks it up into fields and puts each one in a separate array element, in this case in an array named arr.

Now that it has been split, it’s easy to convert the data to something that you can use in a database!

As I write this, %CONCATARR has been announced but is not yet available. I expect to see PTFs to make it available in December 2022 – and I can’t wait!

Scott Klement
Midrange Dynamics Development & Solutions Architect

Scott Klement is an IT professional with a passion for both programming and mentoring. He joined Midrange Dynamics at the beginning of October 2022. He formerly was the Director of Product Development and Support at Profound Logic and the IT Manager and Senior Programmer at Klement’s Sausage Co., Inc. Scott also serves on the Board of Directors of COMMON, where he represents the Education, Innovation, and Certification teams. He is an IBM Champion for Power Systems.

Subscribe to our newsletter and join us next month to see what is happening in Scott’s Corner. Add a great dad joke to your arsenal and gain an even better IT insight from this recognized industry expert as he continues his quest to educate and support the IBM i community. 

The post Scott’s Corner – %SPLIT appeared first on Midrange Dynamics.

For the next three days, we have the COMMON Europe and COMMON (North America) Advisory Councils jointly meeting with us on the future of #IBMi – it’s an excellent opportunity! @COMMONug @CommonEurope #IBMPower

– Steve Will (@Steve_Will_IBMi), 11:03 – Nov 14, 2022
