Node.js Reference Architecture: Logging in Node.js
Understanding what tools to use for logging in your Node.js applications and deployments is a critical step to ensuring they run well in production. This article discusses why the Node.js Reference Architecture recommends Pino as a great tool for logging in your Node.js application. The article concludes by walking you through an example of logging in a Node.js application in production.
Because the Node.js ecosystem offers so many packages in the npm registry, the Node contributors at Red Hat and IBM collaborated to create a reference architecture for Node.js that highlights our recommendations based on our own use. Read our intro, Welcome to the Node.js Reference Architecture. It's important to stress that this recommendation is only meant to be a starting point for teams who want an opinion. There are other good logging options; we don't think teams need to change what they are already using in production, and we understand there can be good reasons to use something else.
Defining requirements for a logging tool
Before we could decide on what tool to recommend, we had to define the requirements for a good logging tool. We quickly agreed that the logger we chose should be:
Able to support structured logging/JSON format
Simple. With the availability of modern log repositories and viewers, we wanted to select a simple logger and recommend doing any complex filtering/handling outside of the Node.js process itself.
Actively maintained. This might seem obvious, but in this process, we realized that one team was using a logger that was no longer maintained. That’s certainly a reminder to keep an eye on your dependencies!
Why we chose Pino for logging
When we looked at the loggers the team members had experience with, there were a number of good options. We agreed on Pino for the following reasons:
Large-scale deployment success. Some of our production Node.js deployments already used Pino, so we have seen it succeed in real-world, large-scale deployments.
Sample and starter kit support. We already used Pino in a number of our samples and starters, so we were confident in its use for other teams.
Ease of use. Of the choices that supported structured/JSON logging, Pino was the easiest for developers to start with. Although other options had fuller feature sets, our belief that a logger should be simple led us to choose Pino over tools with features that might never be needed.
Logging guidance
Our team decided on the following basic guidance for logging in Node.js:
Send logs to standard output (stdout). Use log aggregation tools to collect logs across processes, containers, and applications. All log processing should be done in a separate process.
Plan for a final resting place for log data. After 7 days, migrate log data to lower-cost storage; manual retrieval is likely okay at that point.
Add code to your application that allows logger.level to be set through an environment variable so that it can be easily changed in container environments (see the sketch after this list).
Use the redaction option to ensure sensitive data is not logged.
Limit the use of warn, error, and fatal levels to information which must always be logged.
Limit the use of info level to important information which can always be logged without introducing significant overhead. Customer-usage REST response logs (status/duration/URL) are recommended at this level for security auditing purposes.
Don’t throw and catch multiple errors. Throw once and catch/log at the highest level.
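As a concrete illustration of the environment-variable and redaction items above, here is a minimal Pino setup sketch. The LOG_LEVEL variable name and the redacted paths are illustrative assumptions, not something the reference architecture prescribes.

// logger.js - a minimal Pino setup (sketch; names are illustrative)
const pino = require('pino');

const logger = pino({
  // Let the level be changed per environment (e.g., in a container spec);
  // LOG_LEVEL is an assumed variable name, defaulting to 'info' when unset.
  level: process.env.LOG_LEVEL || 'info',
  // Redact sensitive fields so they never reach the log stream. These
  // paths are examples; list whatever your log objects actually carry.
  redact: ['req.headers.authorization', 'user.email', 'password'],
});

module.exports = logger;

With this in place, a call like logger.info({ user: { email: 'a@example.com' } }, 'login') prints the email as '[Redacted]' while everything else is logged normally.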
Real-world implementation of logging guidance
How do all these recommendations translate to real-world implementation? While the reference architecture sticks to concise recommendations, in this series we want to expand a bit on the "why" behind our decisions and offer real-world examples of key Node.js concepts in use. This section details an IBM team's implementation of these logging principles.
At this point, Jim shares some of the more detailed experience of the IBM team he is part of. Hopefully you'll see how this perspective (along with others) shaped the guidance and recommendations above.
Structured logging
A good logging strategy ensures that your logs are structured as newline-separated JSON records. This makes it easy for modern log repositories and viewers like LogDNA, Kibana, and Loggly to sort and filter on keywords in the JSON record. Even though modern logging repositories will turn string-based logs into JSON logs internally, providing JSON logs lets your applications define their own custom keywords to sort and search on. It's still common for shell scripts at the startup of a container not to use JSON logs, while a Node.js process in the same container uses JSON logging after startup.
Development teams sometimes push back against this because logs become harder to read when viewed directly in a log file. A workaround we use is a log-formatting method that orders specific fields to match the non-JSON format: first in the serialized JSON record come the timestamp, level, and message, with the rest of the keywords following in any order.
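Pino already serializes the level and time fields first; for the rest of the log object, its formatters.log hook can pull important keys to the front (JSON.stringify preserves insertion order for string keys). A sketch follows; the preferred-key list is an illustrative assumption.

const pino = require('pino');

// Keys to serialize ahead of any other custom fields (illustrative).
const PREFERRED_ORDER = ['transaction_id', 'graphql_call'];

const logger = pino({
  formatters: {
    // Called with the merged log object (level, time, and msg are
    // handled separately by Pino itself).
    log(obj) {
      const ordered = {};
      for (const key of PREFERRED_ORDER) {
        if (key in obj) ordered[key] = obj[key];
      }
      // Keys already inserted keep their position; the remaining
      // fields follow in their original order.
      return Object.assign(ordered, obj);
    },
  },
});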
Modern logging repositories handle the storage, filtering, and visualization of the logs your application generates. One example is viewing logs that have been fed by the IBM Log Analysis service into LogDNA. The graphql_call and transaction_id fields visible in such views are discussed in the section on logging formatters and thread-local variables.
Static and dynamic log-level management
Early on, you should define a standard set of log levels that application developers should use. These may differ based on the logging tool that you use, but having them well defined helps ensure you have consistent use across your team.
Your team should review every spot where the application generates an error log to confirm that the variables needed to debug that error path are logged. Doing so helps you avoid recreating the problem later with more detailed logging enabled. Your goal should be "first failure data capture," so that you don't have to recreate problems in order to debug customer issues. If more data is required to debug a customer issue, a separate issue should be created to track adding that data to the logging in that error path.
As mentioned in the guidance above, the initial application logging level is determined by an environment variable setting. However, we found certain "corrupted state" cases where we wanted to make the logging level more verbose without redeploying or restarting the application. This happened when we started to see occasional database or queuing-service errors. We used a hidden REST endpoint that was not exposed publicly but could still be invoked through admin-routing tools like kubectl proxy. The REST endpoint communicated with the logging library to change the logging level dynamically.
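Here is a sketch of such an endpoint using Express and Pino; the path and port are illustrative assumptions, and the endpoint must only be reachable through trusted admin routing (kubectl proxy, port-forwarding), never the public ingress.

const express = require('express');
const logger = require('./logger'); // the Pino instance from earlier

const adminApp = express();

adminApp.put('/admin/log-level', express.json(), (req, res) => {
  const { level } = req.body;
  // logger.levels.values maps Pino level labels to their numeric values.
  if (!logger.levels.values[level]) {
    return res.status(400).json({ error: 'unknown level: ' + level });
  }
  logger.level = level; // Pino levels can be changed at runtime
  res.json({ level: logger.level });
});

// Assumed internal-only port, separate from the public application port.
adminApp.listen(9090);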
Logging formatters and thread-local variables
Logging formatters are customizable methods that let applications add or transform data in the JSON logs they emit. Using formatters to export specific thread-local variables into the logs is a powerful technique for getting data that your support teams will want in all your logs.
For example, your support team may want the ability to filter logs to a specific failing REST call. Generating and storing a REST call's trace-guid in a thread-local variable early in the REST processing (i.e., in one of the first Express middleware layers) lets every subsequent log line in that REST call share the same trace-guid.
Customer error messages that provide that trace-guid make it simple for your support team to help the customer by filtering on that trace-guid to know exactly what the customer was doing. Otherwise, your support team has to hunt through logs for errors with similar timestamps, which is time-consuming. There are other ways to get cross-transaction information including distributed tracing, but it’s often still useful to include this kind of information in logs.
While Node.js doesn't have obvious thread-local variable support like Python and Ruby, it does expose that functionality through its async_hooks module. Specifically, in recent Node.js versions you can use AsyncLocalStorage. These APIs are still marked experimental but have been relatively stable over time. If you are on older Node.js versions, the cls-hooked library provides equivalent functionality back to Node v8.
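Here is a sketch of the trace-guid pattern described above, combining AsyncLocalStorage with Pino's mixin option so every log line in a request carries the same id. The header and field names are illustrative assumptions, not a prescribed convention.

const { AsyncLocalStorage } = require('async_hooks');
const { randomUUID } = require('crypto'); // available since Node 14.17
const pino = require('pino');
const express = require('express');

const als = new AsyncLocalStorage();

const logger = pino({
  // mixin() runs on every log call; its return value is merged into the line.
  mixin() {
    const store = als.getStore();
    return store ? { transaction_id: store.transactionId } : {};
  },
});

const app = express();

// One of the first middleware layers: adopt or create a trace guid and run
// the rest of the request inside the AsyncLocalStorage context.
app.use((req, res, next) => {
  const transactionId = req.get('x-transaction-id') || randomUUID();
  als.run({ transactionId }, next);
});

app.get('/orders', (req, res) => {
  logger.info('listing orders'); // this line includes transaction_id
  res.json([]);
});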
Good thread-local variables to set in a common authentication-validation layer are things like user-guid (as opposed to an email address, for GDPR compliance), organization-id, and whether the user is an application admin. Those allow security auditors to trace what actions a specific user or organization has taken, and to validate that your own admins have not taken inappropriate actions.
If your application uses GraphQL, you will notice that because all GraphQL calls share the same entry endpoint, traditional REST trace logs are not very helpful: they won't indicate which GraphQL call was invoked. We updated our GraphQL entry methods to set the method name in a thread-local variable so that we could include it in the REST trace logs.
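One way to do this, sketched here as an Apollo Server 3-style plugin reusing the AsyncLocalStorage store from the previous sketch (the field name is an illustrative assumption):

const graphqlCallPlugin = {
  async requestDidStart() {
    return {
      async didResolveOperation(requestContext) {
        const store = als.getStore();
        if (store) {
          // operationName is null for anonymous operations
          store.graphqlCall = requestContext.operationName || 'anonymous';
        }
      },
    };
  },
};

The Pino mixin from the earlier sketch can then also return { graphql_call: store.graphqlCall } so trace logs carry the operation name.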
Logging the real IP
In most Kubernetes-based deployments, the source IP address that applications see is just that of the ingress router, not the caller's. If you use a security firewall in front of your Kubernetes cluster (like the Cloudflare service offered in IBM Cloud), the caller's IP address will be carried in an X-Forwarded-For header. The application endpoint handler can then extract and log that header instead of the REST request's source IP address. If you are not using a frontend like that, you may have to configure your load balancer and ingress specifically to ensure the public IP is carried into the Kubernetes ingress and then into your application.
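A sketch of extracting the header in an Express handler; only trust X-Forwarded-For when an edge you control sets it. (With Express you can alternatively configure app.set('trust proxy', ...) so that req.ip resolves from the header.)

// Prefer the X-Forwarded-For header over the socket address.
function callerIp(req) {
  const forwarded = req.get('x-forwarded-for');
  // The header may hold a comma-separated chain of addresses; the first
  // entry is the original caller as reported by the trusted edge.
  return forwarded ? forwarded.split(',')[0].trim() : req.socket.remoteAddress;
}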
Configuration logging at startup
Dumping the application configuration settings at startup can help developers quickly identify and understand configuration problems, as opposed to guessing what the configuration "should be." It provides key information that will save you time when investigating issues.
However, doing that can sometimes expose values like passwords or API keys that should normally remain secret. A common example is a developer adding a "printenv" at the startup of the application to dump all the environment variables, inadvertently exposing the DB passwords that are also passed in as environment variables.
Node.js libraries like convict make it easy to define how configuration settings are imported. They let the app determine whether settings come from mounted config files, environment variables, or both. They also define which elements of a configuration are optional, what the default settings should be, and whether a field's value should be treated as sensitive (i.e., a password or API key). As long as developers mark the password and API-key fields as sensitive in the configuration definition, the configuration object can be safely logged at startup because the sensitive field values are masked.
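A sketch of such a schema with convict; the field and environment-variable names are illustrative assumptions, and the logger is the Pino instance from the earlier sketches.

const convict = require('convict');

const config = convict({
  port: {
    doc: 'HTTP listen port',
    format: 'port',
    default: 8080,
    env: 'PORT',
  },
  dbPassword: {
    doc: 'Database password',
    format: String,
    default: '',
    env: 'DB_PASSWORD',
    sensitive: true, // masked when the config is serialized
  },
});

config.validate({ allowed: 'strict' });

// Safe to log at startup: sensitive values are masked by toString().
logger.info({ config: JSON.parse(config.toString()) }, 'startup configuration');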
Since we mentioned sensitive information, as outlined in the Reference Architecture, it’s also important to consider the information your application is gathering and use redaction to remove sensitive fields when you are logging that gathered information.
Summary
We hope this article has given you a bit more insight and background on the recommendations and guidance in the logging section of the Node.js Reference Architecture along with a deeper dive into some of the team’s experience and what they’ve found useful. We hope you join us for the next installment in the series.
In the meantime, if you’d like to catch up on what IBM and Red Hat are doing on the Node.js front you can check out these landing pages:
IBM Developer Node.js development
Red Hat Developer Node.js development
Retrieve Job log information from SQL – IBM i
Job log
-- All messages in the current job's job log ('*' = the current job)
SELECT ORDINAL_POSITION,
       MESSAGE_TYPE,
       MESSAGE_TIMESTAMP,
       FROM_LIBRARY,
       FROM_PROGRAM,
       MESSAGE_TEXT,
       MESSAGE_SECOND_LEVEL_TEXT
  FROM TABLE(QSYS2/JOBLOG_INFO(JOB_NAME => '*'))
-- Only the diagnostic messages from the current job's job log
SELECT ORDINAL_POSITION,
       MESSAGE_TYPE,
       MESSAGE_TIMESTAMP,
       FROM_LIBRARY,
       FROM_PROGRAM,
       MESSAGE_TEXT,
       MESSAGE_SECOND_LEVEL_TEXT
  FROM TABLE(QSYS2/JOBLOG_INFO(JOB_NAME => '*'))
 WHERE MESSAGE_TYPE = 'DIAGNOSTIC';
-- Job log of a specific job, identified by its qualified job name
SELECT ORDINAL_POSITION,
       MESSAGE_TYPE,
       MESSAGE_TIMESTAMP,
       FROM_LIBRARY,
       FROM_PROGRAM,
       MESSAGE_TEXT,
       MESSAGE_SECOND_LEVEL_TEXT
  FROM TABLE(QSYS2/JOBLOG_INFO(JOB_NAME => '730621/REDDYP/QPAD202846'))
-- Only the request (command) messages for a specific job
SELECT MESSAGE_TYPE,
       MESSAGE_TIMESTAMP,
       FROM_LIBRARY,
       FROM_PROGRAM,
       MESSAGE_TEXT,
       MESSAGE_SECOND_LEVEL_TEXT
  FROM TABLE(QSYS2/JOBLOG_INFO(JOB_NAME => '730621/REDDYP/QPAD202846'))
 WHERE MESSAGE_TYPE = 'REQUEST'
-- Same query; IGNORE_ERRORS => 'YES' returns a warning instead of an
-- error if the job's log cannot be retrieved
SELECT MESSAGE_TYPE,
       MESSAGE_TIMESTAMP,
       FROM_LIBRARY,
       FROM_PROGRAM,
       MESSAGE_TEXT,
       MESSAGE_SECOND_LEVEL_TEXT
  FROM TABLE(QSYS2/JOBLOG_INFO(JOB_NAME => '730621/REDDYP/QPAD202846', IGNORE_ERRORS => 'YES'))
 WHERE MESSAGE_TYPE = 'REQUEST'
Managing IBM i: CIO challenge or opportunity?
May 4th 2021
A constant challenge for a Chief Information Officer in a transactional, data-heavy industry is leveraging the legacy of the IBM i platform to meet current needs, essentially using it to its fullest potential. Keeping that potential tapped translates into a flexible and agile platform that drives growth for your business, an important aspect to keep in mind when approaching the demands of IT.
Keeping the platform alive
The trials of keeping IBM i relevant and integrated with modern languages, open-source tooling, and user-friendly interfaces present a unique set of challenges for the modern CIO. Under normal circumstances this is challenging enough, but add the necessity of remote work and the fact that businesses have had to adapt quickly to stay resilient this past year, and a whole new set of challenges appears. With continuity at risk, all eyes turn to the CIO to maintain a stable and flexible platform with secure options for everyone.
Luckily, these challenges also present an opportunity when developing an IBM i DevSecOps strategy: finding the right tools to modernize your IBM i applications and maintain a resilient platform. Here at ARCAD we are constantly focused on creating resources that help you manage your IBM i applications with ease. We can help analyze your current application portfolio and create a path forward to a modernized, future-proof solution that retains all the security, reliability, and low-overhead benefits of the IBM i platform.
A resilient future
With events like COVID-19 shining a light on how business operations can change seemingly overnight, it has become clear that CIOs should embrace a flexible solution that encompasses application modernization for IBM i. With a deep and flexible application modernization strategy for IBM i in place, your business gains the opportunity to be agile and to make choices that keep it moving forward. Integrating with open-source tools, converting to modern languages, and creating a common link between disparate systems and languages is the key to longevity and resilience.
Embrace the opportunity
Using a range of the tools we've developed here at ARCAD, such as Transformer DB, you can automate your database modernization process while opening your applications to new technologies, ultimately driving an increase in productivity and accuracy. Our Transformer RPG and Transformer Synon solutions both automate the mass transformation of source code to modern free-format RPG, easing the onboarding of new development resources. Once modernized with the ARCAD transformation tools, an IBM i application becomes far easier to update and maintain. Key business logic can be exposed as web services, and the application can be restructured to interface with external applications or devices with maximum security and reliability. Learn more about our entire Modernization for IBM i suite of solutions, and prepare to future-proof your IBM i application assets for continued success.
According to ARCAD software the future of IBM i depends on open-source tools
The IBM i (aka iSeries, AS/400) platform has been embedded in the framework of tens of thousands of organizations for over 30 years, forming the bedrock for businesses that rely on the security, dependability, and low administration overhead that characterize the platform.
As a crucial backend system, IBM i has been kept uniquely relevant through the past decades thanks to IBM embracing and integrating new open-source tools into the platform as they develop.