
Open-sourcing Polynote: an IDE-inspired polyglot notebook


Jeremy Smith, Jonathan Indig, Faisal Siddiqi

We are pleased to announce the open-source launch of Polynote: a new, polyglot notebook with first-class Scala support, Apache Spark integration, multi-language interoperability including Scala, Python, and SQL, as-you-type autocomplete, and more.

Polynote provides data scientists and machine learning researchers with a notebook environment that allows them the freedom to seamlessly integrate our JVM-based ML platform — which makes heavy use of Scala — with the Python ecosystem’s popular machine learning and visualization libraries. It has seen substantial adoption among Netflix’s personalization and recommendation teams, and it is now being integrated with the rest of our research platform.

At Netflix, we have always felt strongly about sharing with the open source community, and believe that Polynote has great potential to address similar needs outside of Netflix.

Feature Overview


Polynote promotes notebook reproducibility by design. By taking a cell’s position in the notebook into account when executing it, Polynote helps prevent bad practices that make notebooks difficult to re-run from the top.

Editing Improvements

Polynote provides IDE-like features such as interactive autocomplete and parameter hints, in-line error highlighting, and a rich text editor with LaTeX support.


The Polynote UI provides at-a-glance insights into the state of the kernel by showing kernel status, highlighting currently-running cell code, and showing currently executing tasks.


Each cell in a notebook can be written in a different language with variables shared between them. Currently Scala, Python, and SQL cell types are supported.

Dependency and Configuration Management

Polynote provides configuration and dependency setup saved within the notebook itself, and helps solve some of the dependency problems commonly experienced by Spark developers.

Data Visualization

Native data exploration and visualization help users learn more about their data without cluttering their notebooks. Integration with Matplotlib and Vega allows power users to communicate with others through beautiful visualizations.

Reimagining the Scala notebook experience

On the Netflix Personalization Infrastructure team, our job is to accelerate machine learning innovation by building tools that can remove pain points and allow researchers to focus on research. Polynote originated from a frustration with the shortcomings of existing notebook tools, especially with respect to their support of Scala.

For example, while Python developers are used to working inside an environment constructed using a package manager with a relatively small number of dependencies, Scala developers typically work in a project-based environment with a build tool managing hundreds of (often) conflicting dependencies. With Spark, developers are working in a cluster computing environment where it is imperative that their distributed code runs in a consistent environment no matter which node is being used. Finally, we found that our users were also frustrated with the code editing experience within notebooks, especially those accustomed to using IntelliJ IDEA or Eclipse.

Some problems are unique to the notebook experience. A notebook execution is a record of a particular piece of code, run at a particular point in time, in a particular environment. Combining code, data and execution results in a single document makes notebooks powerful, but also difficult to reproduce. Indeed, the scientific computing community has documented some notebook reproducibility concerns as well as some best practices for reproducible notebooks.

Finally, another problem that might be unique to the ML space is the need for polyglot support. Machine learning researchers often work in multiple programming languages. For example, researchers might use Scala and Spark to generate training data (cleaning, subsampling, etc.), while actual training might be done with popular Python ML libraries like TensorFlow or scikit-learn.

Next, we’ll go through a deeper dive of Polynote’s features.

Reproducible by Design

Two of Polynote’s guiding principles are reproducibility and visibility. To further these goals, one of our earliest design decisions was to build Polynote’s code interpretation from scratch, rather than relying on a REPL like a traditional notebook.

We feel that while REPLs are great in general, they are fundamentally unfit for the notebook model. In order to understand the problems with REPLs and notebooks, let’s take a look at the design of a typical notebook environment.

A notebook is an ordered collection of cells, each of which can hold code or text. The contents of each cell can be modified and executed independently. Cells can be rearranged, inserted, and deleted. They can also depend on the output of other cells in the notebook.

Contrast this with a REPL environment. In a REPL session, a user inputs expressions into the prompt one at a time. Once evaluated, expressions and the results of their evaluation are immutable. Evaluation results are appended to the global state available to the next expression.

Unfortunately, the disconnect between these two models means that a typical notebook environment, which uses a REPL session to evaluate cell code, causes hidden state to accrue as users interact with the notebook. Cells can be executed in any order, mutating this global hidden state that in turn affects the execution of other cells. More often than not, notebooks are unable to be reliably rerun from the top, which makes them very difficult to reproduce and share with others. The hidden state also makes it difficult for users to reason about what’s going on in the notebook.

In other notebooks, hidden state means that a variable is still available after its cell is deleted.
In a Polynote notebook, there is no hidden state. A deleted cell’s variables are no longer available.

Writing Polynote’s code interpretation from scratch allowed us to do away with this global, mutable state. By keeping track of the variables defined in each cell, Polynote constructs the input state for a given cell based on the cells that have run above it. Making the position of a cell important in its execution semantics enforces the principle of least surprise, allowing users to read the notebook from top to bottom. It ensures reproducibility by making it far more likely that running the notebook sequentially will work.
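The idea of building a cell's input state from the cells above it can be illustrated with a small Python model. This is only a sketch of the concept, not Polynote's implementation; the `Cell` class and `input_state` function are hypothetical names made up for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """A notebook cell and the variables its last run defined."""
    name: str
    outputs: dict = field(default_factory=dict)


def input_state(cells, index):
    """Build the variables visible to cells[index] from the cells above it only."""
    state = {}
    for cell in cells[:index]:
        state.update(cell.outputs)  # later cells shadow earlier definitions
    return state


notebook = [
    Cell("load", {"data": [1, 2, 3]}),
    Cell("scale", {"factor": 10}),
    Cell("report"),
]

# The "report" cell sees everything defined above it.
visible_before = input_state(notebook, 2)

# Deleting the "scale" cell removes its variables from everything below it,
# unlike a REPL session, where `factor` would linger in global state.
del notebook[1]
visible_after = input_state(notebook, 1)
```

Because the state is derived from cell positions rather than accumulated in one global namespace, there is nothing left behind for a deleted cell to leak.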

Better editing

Let’s face it — for someone used to IDEs, writing a nontrivial amount of code in a notebook can feel like going back in time a few decades. We’ve seen users who prefer to write code in an IDE instead, and paste it into the notebook to run. While it’s not our goal to provide all the features of a full-fledged modern IDE, there are a few quality-of-life code editing enhancements that go a long way toward improving usability.

Code editing in Polynote integrates with the Monaco editor for interactive auto-complete.
Polynote highlights errors inside the code to help users quickly figure out what’s gone wrong.
Polynote provides a rich text editor for text cells.
The rich text editor allows users to easily insert LaTeX equations.


As we mentioned earlier, visibility is one of Polynote’s guiding principles. We want it to be easy to see what the kernel is doing at any given time, without needing to dive into logs. To that end, Polynote provides a variety of UI treatments that let users know what’s going on.

Here’s a snapshot of Polynote in the midst of some code execution.

There’s quite a bit of information available to the user from a single glance at this UI. First, it is clear from both the notebook view and task list that Cell 1 is currently running. We can also see that Cells 2 through 4 are queued to be run, in that order.

We can also see that the exact statement currently being run is highlighted in blue: the line defining the value `sumOfRandomNumbers`. Finally, since evaluating that statement launches a Spark job, we can also see job- and stage-level Spark progress information in the task list.

Here’s an animation of that execution so we can see how Polynote makes it easy to follow along with the state of the kernel.

Executing a Polynote notebook

The symbol table provides insight into the notebook’s internal state. When a cell is selected, the symbol table shows any values that resulted from the current cell’s execution above a black line, and any values available to the cell (from previous cells) below the line. At the end of the animation, we show the symbol table updating as we click on each cell in turn.

Finally, the kernel status area provides information about the execution status of the kernel. Below, we show a closeup view of how the kernel status changes from idle and connected, in green, to busy, in yellow. Other states include disconnected, in gray, and dead or not started, in red.

Kernel status changing from green (idle and connected) to yellow (busy)


You may have noticed in the screenshots shown earlier that each cell has a language dropdown in its toolbar. That’s because Polynote supports truly polyglot notebooks, where each cell can be written in a different language!

When a cell is run, the kernel provides the available typed input values to the cell’s language interpreter. In turn, the interpreter provides the resulting typed output values back to the kernel. This allows cells in Polynote notebooks to operate within the same context, and use the same shared state, regardless of which language they are defined in — so users can pick the best tool for the job at hand.

Here’s an example using scikit-learn, a Python library, to compute an isotonic regression of a dataset generated with Scala. This code is adapted from the Isotonic Regression example on the scikit-learn website.

A polyglot example showing data generation in Scala and data analysis in Python

As this example shows, Polynote enables users to fluently move from one language to another within the same notebook.

Dependency and Configuration Management

In order to better facilitate reproducibility, Polynote stores configuration and dependency information directly in the notebook itself, rather than relying on external files or a cluster/server level configuration. We found that managing dependencies directly in the notebook code was clunky and could be confusing to users. Instead, Polynote provides a user-friendly Configuration section where users can set dependencies for each notebook.

Polynote’s Configuration UI, providing user-friendly, notebook-level configuration and dependency management

With this configuration, Polynote constructs an environment for the notebook. It fetches the dependencies locally (using Coursier or pip to fetch them from a repository) and loads the Scala dependencies into an isolated ClassLoader to reduce the chances of a class conflict with Spark libraries. Python dependencies are loaded into an isolated virtualenv. When Polynote is used in Spark mode, it creates a Spark Session for the notebook which uses the provided configuration. The Python and Scala dependencies are automatically added to the Spark Session.

Data Visualization

One of the most important use cases of notebooks is the ability to explore and visualize data. Polynote integrates with two of the most popular open source visualization libraries, Vega and Matplotlib.

While matplotlib integration is quite standard among notebooks, Polynote also has native support for data exploration — including a data schema view, table inspector, plot constructor and Vega support.

We’ll walk through a quick example of some data analysis and exploration using the tools mentioned above, using the Wine Reviews dataset from Kaggle. First, here’s a quick example of just loading the data in Spark, seeing the Schema, plotting it and saving that plot in the notebook.

Example of data exploration using the plot constructor

Let’s focus on some of what we’re seeing here.

View of the quick inspector, showing the DataFrame’s schema. The blue arrow points to the quick access buttons to the table view (left) and plot view (right)

If the last statement of a cell is an expression, it gets assigned to the cell’s Out variable. Polynote will display a representation of the result in a fashion determined by its data type. If it’s a table-like data type, such as a DataFrame or collection of case classes, Polynote shows the quick inspector, allowing users to see schema and type information at a glance.
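The rule that a trailing expression becomes the cell's Out value can be mimicked in plain Python with the standard `ast` module. This is a rough sketch of the general technique, assuming a Python-only cell; the `run_cell` helper is hypothetical and says nothing about how Polynote itself evaluates cells.

```python
import ast


def run_cell(source, state):
    """Execute cell source in `state`; bind a trailing expression's value to `Out`."""
    tree = ast.parse(source)
    last_expr = None
    # If the final statement is a bare expression, split it off for evaluation.
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        last_expr = ast.Expression(tree.body.pop().value)
    # Run everything before the trailing expression as ordinary statements.
    exec(compile(tree, "<cell>", "exec"), state)
    if last_expr is not None:
        state["Out"] = eval(compile(last_expr, "<cell>", "eval"), state)
    return state


with_result = run_cell("x = 2\nx * 21", {})
without_result = run_cell("y = 1", {})
```

Here `with_result["Out"]` holds the value of `x * 21`, while `without_result` has no `Out` entry because its last statement is an assignment.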

The quick inspector also provides two buttons that bring up the full data inspector — the button on the left brings up the table view, while the button on the right brings up the plot constructor. The animation also shows the plot constructor and how users can drag and drop measures and dimensions to create different plots.

We also show how to save a plot to the notebook as its own cell. Because Polynote natively supports Vega specs, saving the plot simply inserts a new Vega cell with a generated spec. As with any other language, Vega specs can leverage polyglot support to refer to values from previous cells. In this case, we’re using the Out value (a DataFrame) and performing additional aggregations on it. This enables efficient plotting without having to bring millions of data points to the client. Polynote’s Vega spec language provides an API for aggregating and otherwise modifying table-like data streams.

A Vega cell generated by the plot constructor, showing its spec

Vega cells don’t need to be authored using the plot constructor — any Vega spec can be put into a Vega cell and plotted directly, as seen below.

Vega’s Stacked Area Chart Example displayed in Polynote

In addition to the cell result value, any variable in the symbol table can be inspected with a click.

Inspecting a variable in the symbol table

The road ahead

We have described some of the key features of Polynote here. We’re proud to share Polynote widely by open sourcing it, and we’d love to hear your feedback. Take it for a spin today by heading over to our website or directly to the code and let us know what you think! Take a look at our currently open issues to see what we’re planning, and, of course, PRs are always welcome! Polynote is still very much in its infancy, so you may encounter some rough edges. It is also a powerful tool that enables arbitrary code execution (“with great power, comes great responsibility”), so please be cognizant of this when you use it in your environment.

Plenty of exciting work lies ahead. We are very optimistic about the potential of Polynote and we hope to learn from the community just as much as we hope they will find value from Polynote. If you are interested in working on Polynote or other Machine Learning research, engineering and infrastructure problems, check out the Netflix Research site as well as some of the current openings.


Many colleagues at Netflix helped us in the early stages of Polynote’s development. We would like to express our tremendous gratitude to Aish Fenton, Hua Jiang, Kedar Sadekar, Devesh Parekh, Christopher Alvino, and many others who provided thoughtful feedback along their journey as early adopters of Polynote.

Open-sourcing Polynote: an IDE-inspired polyglot notebook was originally published in Netflix TechBlog on Medium, where people are continuing the conversation by highlighting and responding to this story.


Primer: Distributed Systems and Cloud Native Computing


This post is part of an ongoing series from Catherine Paganini that focuses on explaining IT concepts for business leaders.

Software is becoming a strategic competitive differentiator across industries — today’s race towards digital transformation is proof of it. Enterprises are investing (often millions of dollars) in technology to better serve their customer base. For business leaders who play a key role in the decision-making and planning process, it’s indispensable to have a basic understanding of how business applications work. What are the organizational constraints and what opportunities do new technology developments bring?

In this article, we’ll provide a high-level overview of what a distributed system or application is. We’ll discuss characteristics, design goals and scaling techniques, as well as types of distributed systems. Finally, we will look into how cloud native technology is changing the status quo.

The Proliferation of Distributed Systems

Starting in the mid-1980s, two technology advancements made distributed systems feasible.

First, there was the development of powerful microprocessors, later made even more powerful through multi-core central processing units (CPUs). This led to so-called parallelism where multiple processes could run at the same time. A single-core CPU, on the other hand, can only run one process at a time, although CPUs are able to switch between tasks so quickly that they appear to run processes simultaneously. As CPU capacity increased, more powerful applications were developed.

The second key development was the invention of high-speed computer networks. Local-area networks (LAN) allowed thousands of physically close machines to be connected and communicate with each other. Wide-area networks (WAN) enabled hundreds of millions of machines to communicate across the world. With literally no physical limit to computational capacity, enterprises were now able to create “supercomputers.”

Characteristics of Distributed Systems

A distributed system is a collection of autonomous computing elements that appears to its users as a single coherent system. Generally referred to as nodes, these components can be hardware devices (e.g. computer, mobile phone) or software processes. A good example is the internet, the world’s largest distributed system. Although it is composed of millions of machines, to you it feels like a single system. You have no idea where the data is stored, how many servers are involved, or how the information gets to your browser. This concept is called abstraction and reappears over and over again in IT. In short, your browser abstracts away the complexity of the internet. The same applies to applications like Gmail, Salesforce, or any enterprise application you may use. You literally interact with distributed applications every single day!

Nodes (machines or processes) are programmed to achieve a common goal. To collaborate, they need to exchange messages. Let’s say:

  1. Your browser sends a request to pull website information from a domain server (node A sends a request to node B);
  2. The domain server pulls the website info from its data store (node B processes the request);
  3. The domain server pushes the website code to your browser (node B sends a response to node A).

Clearly, communication is at the core of distributed systems; if it fails no collaboration is possible and your browser has nothing to display.
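The three-step exchange above can be sketched as two Python objects passing messages. Everything here is a simplified assumption for illustration: the `Message`, `Browser`, and `DomainServer` names are hypothetical, and a direct method call stands in for the network.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: str
    recipient: str
    body: str


class DomainServer:
    """Node B: answers requests from its own data store."""

    def __init__(self, name, data_store):
        self.name = name
        self.data_store = data_store

    def handle(self, request):
        # Step 2: look up the requested page in the data store.
        page = self.data_store.get(request.body, "404 Not Found")
        # Step 3: send the response back to whoever asked.
        return Message(self.name, request.sender, page)


class Browser:
    """Node A: sends a request and displays whatever comes back."""

    def __init__(self, name):
        self.name = name

    def fetch(self, server, path):
        # Step 1: send a request to the server.
        request = Message(self.name, server.name, path)
        response = server.handle(request)
        return response.body


server = DomainServer("node-b", {"/index.html": "<h1>Hello</h1>"})
browser = Browser("node-a")
page = browser.fetch(server, "/index.html")
missing = browser.fetch(server, "/missing")
```

If `server.handle` could not be reached at all, `fetch` would have nothing to return, which is the communication failure the paragraph above describes.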

Middleware and APIs — the Glue Holding Distributed Systems Together

Distributed systems often have a separate software layer placed on top of their respective operating system (OS) called middleware. It creates standards allowing applications that aren’t necessarily compatible to communicate. It also offers a variety of services to the applications such as security or masking of and recovery from failure.

Today, we are hearing less about middleware and more about Application Programming Interfaces (APIs). An API functions as a gateway through which applications can communicate. It’s basically an application interface abstraction. For applications to communicate directly, they must have compatible interfaces, which is often not the case. An API allows different applications to communicate through it. Hence, it abstracts away the implementation details and differences from the applications. Applications don’t need to know anything about other applications except for their API.

You may have heard about open APIs. There has been quite a bit of buzz around them, and with good reason. Some application developers open their APIs to the public so external developers can tap into their data. Google Maps and Yelp are great examples. Yelp taps into Google Maps’ API. You may have seen the little map next to the restaurant description. To get directions, you simply click on the map. How convenient! Open APIs have clearly made our lives a lot easier and, if you pay attention, you’ll see them everywhere.

While that doesn’t mean that middleware is facing extinction — it still has security, coordination, and management functions — APIs are largely substituting the communication aspect of it.
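To make the API abstraction from this section concrete, here is a small Python sketch in which a caller depends only on an interface. The `MapsApi` interface and both implementations are invented for the example and do not correspond to any real mapping service's API.

```python
from typing import Protocol


class MapsApi(Protocol):
    """The only thing a caller needs to know about a mapping service."""

    def directions(self, origin: str, destination: str) -> str: ...


class RoadMaps:
    """One hypothetical implementation; its internals stay hidden from callers."""

    def directions(self, origin: str, destination: str) -> str:
        return f"Drive from {origin} to {destination}"


class TransitMaps:
    """A second hypothetical implementation with the same interface."""

    def directions(self, origin: str, destination: str) -> str:
        return f"Take the train from {origin} to {destination}"


def restaurant_page(name: str, address: str, maps: MapsApi) -> str:
    # The page depends on the interface, not on a specific implementation.
    return f"{name}: {maps.directions('your location', address)}"


by_road = restaurant_page("Cafe Example", "1 Main St", RoadMaps())
by_train = restaurant_page("Cafe Example", "1 Main St", TransitMaps())
```

Swapping `RoadMaps` for `TransitMaps` requires no change to `restaurant_page`, which is the point of hiding implementation details behind an API.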

Design Goals

Distributed systems have four main goals:

  • Resource sharing: Whether storage facilities, data files, services, or networks, you may want to share these resources among applications. Why? It’s simple economics. It’s clearly cheaper to have one high-end reliable storage facility shared among multiple applications than buying and maintaining storage for each separately.
  • Abstraction: to hide the fact that processes and resources are distributed across multiple computers, possibly even geographically dispersed. In other words, as just discussed above, processes and resources are abstracted from the user.
  • Openness: An open distributed system is essentially a system built by components that can be easily used by, or integrated into other systems. Adhering to standardized interface rules, any arbitrary process (e.g. from a different manufacturer) with that interface can talk to a process with the same interface. Interface specifications should be complete (everything needed for an implementation is specified) and neutral (does not prescribe what an implementation should look like).

Completeness and neutrality are key for interoperability and portability. Interoperability means that two implementations from different manufacturers can work together. Portability characterizes the extent to which an app developed for system A will work on system B without modification.

Additionally, distributed systems should be extensible, allowing teams to easily add new or replace existing components without affecting those components staying in place. Flexibility is achieved by organizing a collection of relatively small and easily replaceable or adaptable components.

Scalability, the fourth goal, is needed when a spike in users demands more resources. A good example is the increase in viewership Netflix experiences every Friday evening. Scaling out means dynamically adding more resources (e.g. increasing network capacity to allow for more video streaming); scaling back in means releasing them once consumption has normalized.

Scaling Applications

Let’s have a closer look at scalability. Systems can scale up or out. To scale a system up, you can increase memory, upgrade CPUs, or replace network modules; the number of machines remains the same. Scaling out, on the other hand, means extending the distributed system by adding more machines. In everyday usage, however, “scaling up” is commonly used to refer to scaling out. So if IT is talking about scaling applications up, they are likely referring to scaling out (the cloud has made that really easy).
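As a back-of-the-envelope illustration of scaling out, the number of machines can be derived from the current load. The function below is a hypothetical sketch; the capacity figures and limits are made-up assumptions, not numbers from any real system.

```python
import math


def instances_needed(load, capacity_per_instance, minimum=1, maximum=10):
    """Return how many identical machines are needed to serve `load`."""
    wanted = math.ceil(load / capacity_per_instance)
    # Never drop below the minimum or grow beyond the budgeted maximum.
    return max(minimum, min(maximum, wanted))


# Assumed figures: each machine serves 1,000 concurrent viewers.
friday_evening = instances_needed(load=4500, capacity_per_instance=1000)
monday_morning = instances_needed(load=800, capacity_per_instance=1000)
extreme_spike = instances_needed(load=50000, capacity_per_instance=1000)
```

Scaling up would instead raise `capacity_per_instance` while the machine count stays fixed.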

Catherine Paganini
Catherine Paganini leads Marketing at Kublr. From the strategic to the tactical, Catherine helps Kublr evangelize the limitless power of cloud native technologies, shape the brand, and keep pace with the growth. Before joining the tech startup, Catherine marketed B2B services at renowned organizations such as Booz Allen Hamilton and The Washington Post. She recently discovered her passion for breaking down complex IT concepts so people with no technical background can easily understand the current technological revolution brought to us by cloud native tech and digital transformation.

Scaling out is particularly important for applications that experience sudden spikes, requiring a lot more resources but for a limited time only, like in our Netflix example. Data analytics tools, for instance, may suddenly require more compute capacity when real-time data generation spikes or when algorithms are run. Building applications with the full resource capacity needed during these spikes is expensive — especially if those spikes happen rarely. Scalability allows applications to dynamically add resources during a spike (scale-out) and then return them to the resource pool once they aren’t needed anymore (scale-in). This enables other applications to use them when they need more resources, significantly increasing efficiencies and reducing costs. Scalability brings some challenges, however. Here are three techniques to address them:

  • Hiding communication latencies: when scaling applications out to geographically dispersed machines (e.g. using the cloud), delays in response to a remote-service request are inevitable. Networks are inherently unreliable, leading to latencies that must be hidden from the user. This is where asynchronous communication can help.
    Traditionally, application communication was synchronous. Just like a phone line, synchronous communication is only possible while two processes are connected (two people speaking on the phone). As soon as the connection drops, the message exchange stops. While connected, no further exchanges can happen (if someone else tries to call, the line will be engaged). Delays in replies (latencies) mean that processes remain connected waiting for the reply to arrive, unable to accept new messages, slowing the entire system down.
    Asynchronous communication, on the other hand, is more like email. Process A sends a message into the network and doesn’t care if B is online or not. The message will be delivered to process B’s “inbox” and processed when B is ready. Process A, meanwhile, can continue its work and will be notified once a response from B arrives.
  • Partitioning and distribution are generally used for large databases. By separating data into logical groups, each placed on a different machine, processes working on that dataset know where to access it. Amazon’s global employee database, for instance, which is presumably huge, may be partitioned by last name. When a search query for “John Smith” is executed, the program doesn’t have to scan all servers containing employee data. Instead, it will go directly to the server hosting data on employees whose last name starts with “s”, significantly increasing data access speed. This, by the way, is exactly how the Internet Domain Name System (DNS) is built. As you can imagine, it would be impossible to scan the entire internet each time you’re looking for a website. Instead, URLs indicate the path leading to the data.
  • Replication: Components are replicated across the distributed system to increase availability. For example, Wikipedia’s database is replicated throughout the world so users, no matter where they are, have quick access. In geographically widely dispersed systems, having a copy nearby can hide much of the communication latency. Replicas also help balance the load between components, increasing performance. Think Google Maps. During rush hour, there is a huge spike of traffic data that needs to be analyzed in order to provide timely predictions. By replicating processes that crunch that data, Google is able to balance the load between more processes, returning faster results.
  • Caching is a special form of replication. It’s a copy of a resource, generally close to the user accessing that resource. You’re probably familiar with your browser’s cache. To load pages faster, your browser saves recently visited website data locally for a limited time.
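The email-style asynchronous communication described in the list above can be sketched with Python's `asyncio` queues. This is a single-process simulation under simplifying assumptions: the queues stand in for the network, the sleep stands in for latency, and `process_a` and `process_b` are hypothetical names.

```python
import asyncio


async def process_b(inbox, outbox):
    """Process B handles messages from its inbox whenever it is ready."""
    while True:
        message = await inbox.get()
        if message is None:          # sentinel: no more messages
            break
        await asyncio.sleep(0.01)    # simulated network and processing delay
        await outbox.put(f"reply to {message}")


async def process_a():
    inbox, outbox = asyncio.Queue(), asyncio.Queue()
    worker = asyncio.create_task(process_b(inbox, outbox))

    # A drops its messages into B's inbox without waiting for replies...
    for message in ("m1", "m2", "m3"):
        await inbox.put(message)
    await inbox.put(None)

    # ...and continues with its own work in the meantime.
    own_work = sum(range(1000))

    # Later, A collects whatever replies have arrived.
    await worker
    replies = [outbox.get_nowait() for _ in range(outbox.qsize())]
    return own_work, replies


own_work, replies = asyncio.run(process_a())
```

In a synchronous design, A would block after each message until B answered, so the simulated delay would be paid once per message before A could do anything else.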

Caching and replication bring a serious drawback that can adversely affect scalability: keeping the copies consistent. Total consistency is impossible to reach, and the extent to which inconsistencies can be tolerated depends on the application.
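A tiny time-limited cache shows how a copy can drift from its source. This is a minimal sketch, assuming a single key-value origin and an explicit clock value passed in by the caller; `TtlCache` is a hypothetical class, not a real library.

```python
class TtlCache:
    """A local copy of remote values that expires after `ttl` seconds."""

    def __init__(self, fetch, ttl):
        self.fetch = fetch      # function that reads the authoritative source
        self.ttl = ttl
        self.entries = {}       # key -> (value, time stored)

    def get(self, key, now):
        cached = self.entries.get(key)
        if cached is not None and now - cached[1] < self.ttl:
            return cached[0]    # fast, but possibly stale
        value = self.fetch(key)
        self.entries[key] = (value, now)
        return value


origin = {"price": 10}
cache = TtlCache(fetch=origin.__getitem__, ttl=60)

first = cache.get("price", now=0)     # miss: reads the origin
origin["price"] = 12                  # the source changes
stale = cache.get("price", now=30)    # hit: still returns the old value
fresh = cache.get("price", now=61)    # expired: reads the origin again
```

A shorter `ttl` narrows the window of inconsistency at the cost of more trips to the origin, which is the trade-off each application has to weigh.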

Cluster, Grid, and Cloud Computing

Distributed computing systems can be grouped into three categories:

  • Cluster computing is a collection of similar machines connected through a high-speed local-area network. Each node runs on the same hardware and OS. Cluster computing is often used for parallel programming in which a single compute-intensive program runs in parallel on multiple machines. Each cluster consists of a collection of compute nodes monitored and managed by one or more master nodes. The master handles things such as allocation of worker nodes to a particular process and management of request queues. It also provides the system interface for users. In short, the master manages the cluster while the workers run the actual program.

  • Grid computing is composed of nodes with stark differences in hardware and network technology. Today’s trend towards specifically configuring nodes for certain tasks has led to more diversity which is more prevalent in grid computing. No assumptions are made in terms of similarity in hardware, OS, network, or security policies. Note that in daily tech jargon cluster is generally used for both cluster and grid computing.
  • Cloud computing is a pool of virtualized resources hosted in a cloud provider’s datacenter. Customers can construct a virtualized infrastructure and leverage a variety of cloud services. Virtualized means that resources appear to be a single piece of hardware (e.g. individual machine or storage facility) but are actually a piece of software running on that hardware. A virtual machine (VM), for instance, is code that “wraps around” an application, pretending to be hardware. The code running inside the VM thinks the VM is a separate computer, hence the term “virtual” machine. To the client, it seems as if they are renting their own private machine. In reality, however, they are likely sharing it with other clients. The same applies to virtual storage or memory. These virtualized resources can be dynamically configured enabling scalability: if more compute resources are needed, the system can simply acquire more.
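The master and worker roles described for cluster computing above can be imitated on a single machine. In this sketch a thread pool stands in for the worker nodes, which is an assumption made purely to keep the example self-contained; a real cluster would distribute the chunks over the network.

```python
from concurrent.futures import ThreadPoolExecutor


def worker(chunk):
    """Worker: run the actual computation on its share of the data."""
    return sum(n * n for n in chunk)


def master(numbers, worker_count=4):
    """Master: split the job, hand chunks to workers, combine the results."""
    chunks = [numbers[i::worker_count] for i in range(worker_count)]
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        partial_sums = list(pool.map(worker, chunks))
    return sum(partial_sums)


total = master(list(range(1, 101)))
```

The master never computes a square itself; it only allocates work and aggregates results, mirroring the division of labor in the cluster description.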

Since cloud computing is so prevalent, let’s have a quick look at how it’s organized. We can differentiate four layers:

  • Hardware consists of processors, routers, power, and cooling systems. They “live” in the cloud providers’ data center and are never seen by the user.
  • Infrastructure represents the backbone of cloud computing. Customers rent infrastructure consisting of VMs, virtual storage and other virtualized compute resources, and less frequently even bare-metal machines.
  • Platform provides developers with an easy way to develop and deploy applications in the cloud. Through a vendor-specific application programming interface (API), developers can upload and execute programs.
  • Application consists of cloud-hosted applications, either end-user applications or applications that developers can further customize.

Each layer represents an additional abstraction layer, meaning the user neither has nor needs any knowledge about the underlying layers. Cloud providers offer a variety of services on each layer:

  • Infrastructure-as-a-Service: Amazon S3 or EC2;
  • Platform-as-a-Service: Google App Engine or Microsoft Azure;
  • Software-as-a-Service: Gmail, YouTube, Google Docs, Salesforce, etc.

Cloud Native, Delivering the Next Level of Openness and Portability

With the rise of cloud native and open source technologies, we are hearing a lot about openness, portability, and flexibility. Yet the desire to build distributed systems based on these principles is by no means new. Cloud native technologies are bringing these concepts to a whole new level, however.

Before we get into the “how,” let’s step back and talk about what cloud native technologies are. Similar to cloud managed services, cloud native technologies, among other things, provide services like storage, messaging, or service discovery. Unlike cloud managed services, they are infrastructure-independent, configurable, and in some cases more secure. While cloud providers were the driving force behind these services (which brought us unprecedented developer productivity!), open source projects and startups started to codify them and provide analogous services with one big advantage: they don’t lock you in. Now, the term cloud native can be a little misleading. While developed for the cloud, these technologies are not cloud-bound. In fact, we are increasingly seeing enterprises deploying them on-premises.

So, what’s the big deal? The new stack, as cloud native technologies are often referred to, is enabling organizations to build distributed systems that are more open, portable, and flexible than ever before, if implemented properly. Here are some examples of cloud native innovation:

  • Containers could be interpreted as the new lightweight VMs (though they may be deployed on VMs). Containers are much more infrastructure-independent and thus more portable across environments. Additionally, being more lightweight, they spin up faster and occupy fewer resources than VMs.
  • Kubernetes functions as a sort of data center OS, managing resources for containerized applications across environments. If implemented correctly, Kubernetes can serve as an infrastructure abstraction where all your infrastructures (on-prem and clouds) become part of a single pool of resources. That means developers don’t have to care where their applications run: they just deploy them on Kubernetes, and Kubernetes takes care of the rest.
  • Cloud native services are the new cloud-independent counterparts of cloud managed services. Services include storage (e.g. Portworx, Ceph, or Rook), messaging (e.g. RabbitMQ), or service discovery and configuration (e.g. etcd). They are self-hosted, offer a lot more control, and work across environments.
  • Microservices are applications that are broken down into micro components, referred to as services. Each service is self-contained and independent, and can thus be added, removed, or updated while the system is running. This eases extensibility even further.
  • Interoperability and extensibility: Cloud native technologies led to a shift from large and heavy technology solutions that lock customers in towards modular, open technologies you can plug and play into your architecture if the architecture was built in a layered fashion according to architectural best practices.

Cloud native technologies have the potential to create a truly open and flexible distributed system that future-proofs your technology investments. This potential benefit is often compromised by building systems with a specific use case in mind and tying them to a particular technology stack or infrastructure. This ultimately transforms open source components into opinionated software, negating the very benefits we all applaud.

To avoid this, systems should be built based on architectural best practices such as a clean separation between layers. While it requires a little more planning and discipline early on, it will speed adoption of newer technology down the line allowing organizations to pivot quickly with market demand.

Because this is so crucial, an enterprise-grade Kubernetes platform, which will be at the core of your cloud native stack, should be built on these very principles. So do your research, understand the implications, and keep future requirements in mind. Cloud native technologies offer the opportunity for a fresh new start — don’t lock yourself in again.

If this was useful or if you have any feedback, please use the comment section below or tweet at me.

Thanks to Oleg Chunikin who patiently clarified any questions and ensured I got all the technical details correct.

Feature image via Pixabay.

The post Primer: Distributed Systems and Cloud Native Computing appeared first on The New Stack.


3 Steps to De-Risk Your Day 2 Operations


In a complex IT landscape that is changing rapidly and growing each day, enterprise companies are responding to the all-encompassing pressure to adopt new technologies. 


What complicates the situation further is the issue of risk. Enterprise companies want the benefits of new technologies, such as faster deployment and time-to-market, but they are aware of the risk of performance and stability issues typically associated with such technological transitions.


These risks multiply once applications are ready to scale and move beyond Day 1 implementation to Day 2 operations. The reality is that if your revenue and productivity are powered by cloud native applications, this kind of risk is not acceptable for most enterprise organizations.


To minimize risk, dodge the typical pitfalls, and be successful with your cloud native transition, you need a framework for an intelligent strategy that goes way beyond Day 1. D2iQ has developed a proven track record for enabling cloud native transformations, and in this ebook we take away the Day 2 guesswork by providing you with a 3-pronged blueprint for future-proofing your deployments for the long-term. 


By making the three moves outlined in this ebook, your organization will be able to fully embrace prevailing open source and cloud native innovations while realizing smarter Day 2 operations. In this ebook, you’ll learn the importance of:


  • Thinking long-term by designing with Day 2 in mind
  • Shifting your mindset from understanding newer technology to understanding processes
  • Identifying your organization’s unique production requirements

    Learn how to succeed at cloud native and improve your business strategy by downloading "3 Steps to De-Risk Your Day 2 Operations".


    Creating, monitoring, and testing cron jobs on AWS


    Cron jobs are everywhere—from scripts that run your data pipelines to automated cleanup of your development machine, from cleaning up unused resources in the cloud to sending email notifications. These tasks tend to happen unnoticed in the background. And in any business, there are bound to be many tasks that could be cron jobs but are instead processes run manually or as part of an unrelated application.

    Many companies want to take control of their cron jobs: to manage costs, to make sure the jobs are maintainable and the infrastructure running them is up to date, and to share the knowledge about how the jobs run. For those already bringing the rest of their infrastructure to Amazon’s public cloud, running cron jobs in AWS is an obvious choice.

    If you are a developer using AWS, and you’d like to bring your cron jobs over to AWS, there are two main options: use an EC2 machine—spin up a VM and configure cron jobs to run on it; or use AWS Lambda—a serverless computing service that abstracts away machine management and provides a simple interface for task automation.

    On the face of it, EC2 might seem like the right choice to run cron jobs, but over time you’ll find yourself starting to run into the following issues:

    1. Most cron jobs don’t need to be run every second, nor even every hour. This means that the EC2 machine reserved for the cron jobs is idle at least 90% of the time, not to mention that its resources aren’t being used efficiently.
    2. The machine running the cron jobs will of course require regular updates, and there must be a mechanism in place to handle that, whether it’s a Terraform description of the instance or a Chef cookbook.
    3. Did the cron job run last night? Did the average run time change in the last few weeks? Answering these and other questions will require adding more code to your cron job, which can be hard to do if your cron job is a simple Bash script.
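To put the first point in numbers, here is a rough sketch of how small a share of a dedicated machine's time a scheduled job occupies. The figures are assumptions chosen for illustration: a job that runs once a week and takes the full 900 seconds used as the function timeout in the configuration later in this article. The `utilization` helper is hypothetical.

```javascript
// Fraction of wall-clock time a periodic job keeps a machine busy.
// Both arguments are in seconds; the result is capped at 1 (always busy).
function utilization(runSeconds, periodSeconds) {
  if (runSeconds < 0 || periodSeconds <= 0) {
    throw new RangeError('runSeconds must be >= 0 and periodSeconds must be > 0');
  }
  return Math.min(runSeconds / periodSeconds, 1);
}

const WEEK_SECONDS = 7 * 24 * 60 * 60; // 604800 seconds in a week

// Assumed workload: one run per week, 900 seconds per run.
const busy = utilization(900, WEEK_SECONDS);
console.log(`busy ${(busy * 100).toFixed(2)}% of the week`); // busy 0.15% of the week
```

Under these assumptions the machine would sit idle for well over 99% of the week, which is the gap that a pay-per-use model closes.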

    AWS Lambda addresses all of these issues. With its pay-per-use model, you only pay for the compute time used by your Lambda applications. For short-lived tasks, this can generate significant savings. When deploying Lambda with Serverless Framework, the description of all the infrastructure to which the function connects resides in the same repository as the application code. In addition, you get metrics, anomaly detection, and easy-to-use secrets management right out of the box.

    In this article we'll walk you through how to create a cron job on AWS using AWS Lambda and Serverless Framework, how to get the right alerts and security measures in place, and how to scale your cron jobs as needed. Take a look at our example repo for this article on GitHub if you’d like to follow along. Let’s dive in!

    Creating a cron job with AWS Lambda

    In this example we’ll walk through a cron job that performs database rollovers. Our use case: we want to archive the past week’s data from a production database in order to keep the database small while still keeping its data accessible. We start by defining all the details of our cron job application in the serverless.yml file in the root of our repository:

        # serverless.yml
        service: week-cron
        frameworkVersion: ">=1.43.0"
        provider:
          name: aws
          runtime: nodejs8.10
          region: 'us-east-1'
          timeout: 900 # in seconds

    Our function needs to connect to our production database, so we supply the secrets we need for that database via environment variables:

        # serverless.yml
        provider:
          environment:
            DB_HOST: ${file(./secrets.json):DB_HOST}
            DB_USER: ${file(./secrets.json):DB_USER}
            DB_PASS: ${file(./secrets.json):DB_PASS}
            DB_NAME: ${file(./secrets.json):DB_NAME}
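Inside the function, these values arrive as ordinary environment variables. As a minimal sketch (not code from the example repo), a small helper could collect them and fail fast when one is missing, so that a misconfigured deploy shows up as a clear error instead of a failed connection. The helper name `loadDbConfig` is hypothetical; the returned keys follow the usual MySQL connection options.

```javascript
// Hypothetical helper: collect the database settings from environment variables
// and fail fast when one is missing. An empty string is accepted, because a
// local MySQL root password may legitimately be empty.
const REQUIRED_VARS = ['DB_HOST', 'DB_USER', 'DB_PASS', 'DB_NAME'];

function loadDbConfig(env = process.env) {
  const missing = REQUIRED_VARS.filter((name) => env[name] === undefined);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    host: env.DB_HOST,
    user: env.DB_USER,
    password: env.DB_PASS,
    database: env.DB_NAME,
  };
}
```

Passing the environment as a parameter keeps the helper easy to test: a plain object can stand in for `process.env`.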

    We then add the description of our function. We want it to have a single function called transfer which performs the database rollover. We want the transfer function to run automatically every week at a time when the load for our application is the lowest, say around 3am on Mondays:

        # serverless.yml
        functions:
          transfer:
            handler: handler.transfer
            events:
              # every Monday at 03:15 AM UTC
              - schedule: cron(15 3 ? * MON *)
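Schedule expressions are evaluated in UTC, so it can help to check when a weekly schedule will actually fire. The sketch below is only an illustration of the arithmetic for a "Monday at 03:15 UTC" schedule, not how AWS computes trigger times; `nextWeeklyRun` and its parameters are hypothetical.

```javascript
// Next occurrence (strictly after `now`) of a weekly UTC schedule.
// dayOfWeek follows Date#getUTCDay(): 0 = Sunday, 1 = Monday, ...
function nextWeeklyRun(now, dayOfWeek = 1, hour = 3, minute = 15) {
  // Start from today's date at the scheduled time, in UTC.
  const candidate = new Date(Date.UTC(
    now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate(), hour, minute
  ));
  // Move forward to the requested weekday (0 to 6 days ahead).
  const daysAhead = (dayOfWeek - candidate.getUTCDay() + 7) % 7;
  candidate.setUTCDate(candidate.getUTCDate() + daysAhead);
  // If that moment has already passed, the next run is one week later.
  if (candidate.getTime() <= now.getTime()) {
    candidate.setUTCDate(candidate.getUTCDate() + 7);
  }
  return candidate;
}

// Tuesday 2018-08-21 at noon UTC: the next run is the following Monday.
console.log(nextWeeklyRun(new Date('2018-08-21T12:00:00Z')).toISOString());
// 2018-08-27T03:15:00.000Z
```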

    Syntax for the Schedule expressions

    In our example above, the transfer handler gets run on a schedule specified in the events block, in this case via the schedule event. The syntax for the schedule event can be of two types:

    • rate — with this syntax you specify the rate at which your function should be triggered.

    The schedule event using the rate syntax must specify the rate as rate(value unit). The supported units are minute/minutes, hour/hours and day/days. If the value is 1 then the singular form of the unit should be used, otherwise you’ll need to use the plural form. For example:

              - schedule: rate(15 minutes)
              - schedule: rate(1 hour)
              - schedule: rate(2 days)
    • cron — this option is for specifying a more complex schedule using a syntax similar to the Linux crontab syntax.

    The cron schedule events use the syntax cron(minute hour day-of-month month day-of-week year). You can specify multiple values for each unit separated by a comma, and a number of wildcards are available. See the AWS Schedule Expressions docs page for the full list of supported wildcards and the restrictions on using multiple wildcards together.

    These are valid schedule events:

        # the example from our serverless.yml that runs every Monday at 03:15AM UTC
        - schedule: cron(15 3 ? * MON *)
        # run every 10 minutes, starting at the top of the hour, Monday through Friday
        - schedule: cron(0/10 * ? * MON-FRI *)
        # run every day at 6:00PM UTC
        - schedule: cron(0 18 * * ? *)

    You can specify multiple schedule events for each function in case you’d like to combine the schedules. It’s possible to combine rate and cron events on the same function, too.
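Both syntaxes are easy to get slightly wrong, so here is a small sketch of two helpers: one builds a rate expression with the correct singular or plural unit, and the other splits a cron expression into its six named fields. The function names are hypothetical, and the check on the two day fields reflects our reading of the AWS rule that day-of-month and day-of-week cannot both be specified (one of them has to be `?`); the AWS docs remain the reference.

```javascript
// Build a rate(...) expression, choosing the singular or plural unit.
function rateExpression(value, unit) {
  const units = ['minute', 'hour', 'day'];
  if (!Number.isInteger(value) || value < 1) {
    throw new RangeError('value must be a positive integer');
  }
  if (!units.includes(unit)) {
    throw new RangeError(`unit must be one of: ${units.join(', ')}`);
  }
  return `rate(${value} ${value === 1 ? unit : unit + 's'})`;
}

// Split a cron(...) expression into its six named fields.
function parseCronExpression(expression) {
  const match = /^cron\((.*)\)$/.exec(expression.trim());
  if (!match) {
    throw new SyntaxError('expected an expression of the form cron(...)');
  }
  const fields = match[1].trim().split(/\s+/);
  if (fields.length !== 6) {
    throw new SyntaxError(`expected 6 fields, got ${fields.length}`);
  }
  const [minute, hour, dayOfMonth, month, dayOfWeek, year] = fields;
  // Assumed rule: exactly one of the two day fields must be "?".
  if ((dayOfMonth === '?') === (dayOfWeek === '?')) {
    throw new SyntaxError('exactly one of day-of-month and day-of-week must be "?"');
  }
  return { minute, hour, dayOfMonth, month, dayOfWeek, year };
}

console.log(rateExpression(1, 'hour'));      // rate(1 hour)
console.log(rateExpression(15, 'minute'));   // rate(15 minutes)
console.log(parseCronExpression('cron(15 3 ? * MON *)').dayOfWeek); // MON
```

These helpers only check the overall shape of an expression. They do not validate individual field values or wildcards.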

    Business logic for transferring database records

    At this point, the description of the function is complete. The next step is to add code to the transfer handler. The handler.js file defining the handlers is quite short:

        // handler.js
        exports.transfer = require("./service/transfer_data").func;

    The actual application logic lives in the service/transfer_data.js file. Let’s take a look at that file next.

    The task that we want our application to accomplish is a database rollover. When it runs, the application goes through three steps:

    1. Ensures all the necessary database tables are created.
    2. Transfers data from the production table to a “monthly” table.
    3. Cleans up the data that has been transferred from the production table.

    We assume that no records with past dates can be added in the present, and that creating additional load on the production database is fine.

    We start by referencing the helper functions for each of the three tasks we defined above and initializing the utilities for date operations and database access:

        // service/transfer_data.js
        var monthTable = require('../database/create_month_table')
        var transferData = require('../database/transfer_data')
        var cleanupData = require('../database/cleanup_data')
        var dateUtil = require('../utils/date')
        const Client = require('serverless-mysql')

    The function code is quite straightforward: ensure the tables exist, transfer the data, delete the data, and log all the actions that are happening. The simplified version is below:

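        // service/transfer_data.js
        exports.func = async () => {
            // The config body is elided in this simplified version; the fields below
            // are an assumption based on the environment variables in serverless.yml.
            var client = Client({
                config: {
                    host: process.env.DB_HOST,
                    database: process.env.DB_NAME,
                    user: process.env.DB_USER,
                    password: process.env.DB_PASS
                }
            })
            // getWeekNumber returns [year, week]
            var weeknumber = dateUtil.getWeekNumber(new Date())
            var currentWeek = weeknumber[1];
            var currentYear = weeknumber[0];
            try {
                await monthTable.create(client, currentWeek, currentYear)
                await transferData.transfer(client, currentWeek, currentYear)
                await cleanupData.cleanup(client, currentWeek, currentYear)
            } catch (error) {
                if (error.sqlMessage) {
                    // handle SQL errors
                } else {
                    // handle other errors
                }
            }
            return "success";
        }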

    You can find the full version of the file in our GitHub repository.
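The dateUtil.getWeekNumber helper is not shown in this article. As a minimal sketch, assuming it returns a [year, week] pair (which is how the handler indexes the result) and that the week number has to agree with MySQL's WEEK() function in its default mode (weeks start on Sunday, and days before the first Sunday of the year fall in week 0), it could look like this. The implementation in the repo may differ.

```javascript
// Sketch of a week-number helper compatible with MySQL WEEK(date) in mode 0.
// Returns [year, week], computed in UTC.
function getWeekNumber(date) {
  const MS_PER_DAY = 24 * 60 * 60 * 1000;
  const year = date.getUTCFullYear();
  const jan1 = Date.UTC(year, 0, 1);
  const today = Date.UTC(year, date.getUTCMonth(), date.getUTCDate());
  // 1-based day of the year.
  const dayOfYear = Math.floor((today - jan1) / MS_PER_DAY) + 1;
  // 1-based day of the year of the first Sunday (getUTCDay(): 0 = Sunday).
  const firstSunday = ((7 - new Date(jan1).getUTCDay()) % 7) + 1;
  // Days before the first Sunday belong to week 0.
  const week = dayOfYear < firstSunday
    ? 0
    : Math.floor((dayOfYear - firstSunday) / 7) + 1;
  return [year, week];
}

console.log(getWeekNumber(new Date(Date.UTC(2018, 7, 21)))); // [ 2018, 33 ]
```

Keeping the JavaScript and SQL week numbering in agreement matters here, because the week computed in the handler is compared against WEEK(date) in the queries that follow.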

    The helper function for creating the monthly table exports a single create function that essentially consists of a SQL query:

        // database/create_month_table.js
        exports.create = async (client, week, year) => {
            await client.query(`
            CREATE TABLE IF NOT EXISTS weather_${year}_${week}
            (
                id MEDIUMINT UNSIGNED not null AUTO_INCREMENT,
                date TIMESTAMP,
                city varchar(100) not null,
                temperature int not null,
                PRIMARY KEY (id)
            );
            `)
        }

    The transfer_data helper is similar in structure with its own SQL query:

        // database/transfer_data.js
        exports.transfer = async (client, week, year) => {
            var anyId = await client.query(`select id from weather where YEAR(date)=? and WEEK(date)=?`, [year, week])
            if (anyId.length == 0) {
                console.log(`no records exist for year = ${year} and week = ${week}`)
                return // nothing to transfer
            }
            await client.query(`
            INSERT INTO weather_${year}_${week}
            (date, city, temperature)
            select date, city, temperature
            from weather
            where YEAR(date)=? and WEEK(date)=?
            `, [year, week])
        }

    And finally, the cleanup of the data in the cleanup helper looks like this:

        // database/cleanup_data.js
        exports.cleanup = async (client, week, year) => {
            var anyId = await client.query(`select id from weather where YEAR(date)=? and WEEK(date)=?`, [year, week])
            if (anyId.length == 0) {
                console.log(`cleanup not needed: no records exist for year = ${year} and week = ${week}`)
                return
            }
            // only delete once the archive table is known to contain data
            anyId = await client.query(`select id from weather_${year}_${week} limit 1`)
            if (anyId.length == 0) {
                throw Error(`cleanup can't finish: records were not transferred for year = ${year} and week = ${week}`)
            }
            await client.query(`
            delete
            from weather
            where YEAR(date)=? and WEEK(date)=?
            `, [year, week])
        }

    With this, the core business logic is done. We also add a number of unit tests for the business logic that can be found in the test directory in our repo.
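The order of the last two steps matters: cleanup refuses to delete anything from the production table unless the archive table already holds data. The following in-memory sketch illustrates that guard without a database. Plain arrays stand in for the tables, and all names (`createStore`, `transfer`, `cleanup`) are hypothetical stand-ins, not the helpers from the repo.

```javascript
// In-memory stand-in for the two tables. Records look like
// { year, week, city, temperature }.
function createStore(rows) {
  return { weather: rows.slice(), archive: [] };
}

const inWeek = (week, year) => (r) => r.week === week && r.year === year;

// Step 2: copy the week's records into the archive. Returns the number copied.
function transfer(store, week, year) {
  const rows = store.weather.filter(inWeek(week, year));
  store.archive.push(...rows);
  return rows.length;
}

// Step 3: delete the week's records from production, but only after
// confirming that the archive contains data for that week.
function cleanup(store, week, year) {
  const pending = store.weather.filter(inWeek(week, year));
  if (pending.length === 0) {
    return 0; // nothing to clean up
  }
  if (!store.archive.some(inWeek(week, year))) {
    throw new Error('records were not transferred; refusing to delete');
  }
  store.weather = store.weather.filter((r) => !inWeek(week, year)(r));
  return pending.length;
}

const store = createStore([
  { year: 2018, week: 33, city: 'Berlin', temperature: 24 },
  { year: 2018, week: 34, city: 'Berlin', temperature: 22 },
]);
transfer(store, 33, 2018);
cleanup(store, 33, 2018);
console.log(store.weather.length, store.archive.length); // 1 1
```

Calling cleanup before transfer throws instead of deleting, so a failed or skipped transfer cannot silently lose data.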

    The next step is to deploy our cron job.

    Deploying our cron job to AWS

    Both the application code and the serverless.yml file are now set up. The remaining steps to deploy our cron job are as follows:

    • Install the Serverless Framework.
    • Install the required dependencies.
    • Run the deployment step.

    To install Serverless Framework, we run:

        $ npm install -g serverless

    To install our application’s dependencies, we run:

        $ npm install

    in the project directory.

    We now have two options for how to run the deployment step. One option involves setting up AWS credentials on your local machine, and the other is to set up the AWS credentials in the Serverless Dashboard without giving your local machine direct access to AWS.

    Option 1: Use AWS credentials on the development machine

    This option works well if you have only one person deploying a sample cron job, or if the developers on your team already have access to the relevant AWS production account. We don’t recommend this option for larger teams and production applications. Follow these steps:

    1. Make sure that the AWS CLI is installed locally. Try running aws --version, and if the CLI is not yet installed, run pip install awscli.
    2. Configure the AWS credentials for the AWS CLI by running aws configure.
    3. Once the credentials are set up, run serverless deploy to deploy the cron job.

    Option 2: Use the Serverless Dashboard to generate single-use AWS credentials for each deploy

    We recommend this option for teams with multiple developers. With this setup, you grant the Serverless Dashboard a set of AWS permissions, and for each deploy the Serverless Framework will generate a single-use credential with limited permissions to deploy your cron job.

    Before deploying, if you don’t yet have an account, sign up for the Serverless Dashboard. Once your account is set up, create a new application using the Add button:

    To make sure Serverless Framework knows which application to associate with our cron job, we add the tenant and app attributes to the serverless.yml file at the root level. You will need to replace the values shown here with the ones from your Serverless account:

        # serverless.yml
        # our Serverless Dashboard account name
        tenant: chiefwizard
        # our Serverless Dashboard application name
        app: cron-database-rollover

    After that, the steps to deploy are:

    1. In the Dashboard, navigate to Profiles → Create or choose a profile → AWS credential access role.

    2. Select Personal AWS Account and specify the IAM role you’d like to use for deployment. If the role doesn’t exist yet, click the Create a role link to create it.

    3. Click Save and Exit.

    4. Run serverless login in the console on your local machine and log in with your Serverless Dashboard credentials.

    5. Run serverless deploy without configuring the production AWS account on your machine.

    Done! The cron job is deployed and will run on the schedule that we configured in the serverless.yml file.

    Check out this YouTube video where we walk through the deployment process live.

    Setting up monitoring for your cron job

    When you deploy via the Serverless Dashboard (our recommended approach), the monitoring is already set up once the cron job is deployed. On the Applications page, we click through to the deployment we just ran:

    On the deployment page, when we go to the Overview section we see the list of alerts and a graph of function invocations and errors:

    It’s currently empty, but more information will appear as the cron job starts being invoked. On the Alerts tab, any alerts relevant to your cron job will be displayed. That’s it! No extra work needed to set up monitoring and alerting.

    Deploying via the Serverless Dashboard also allows you to use the Dashboard to look through recent invocations of your function (to be found in the Invocations Explorer tab), list all the deployments of your service, and more.

    Writing and running tests for your cron job

    To increase confidence in our cron job’s code, we’ll create a few unit tests to cover the main parts of our database rollover logic in the test/database_transfer_test.js file.

    We start by requiring all of our helper files and setting up the test data:

        // test/database_transfer_test.js
        var assert = require('assert');
        var fs = require("fs")
        var dateUtil = require('../utils/date')
        var monthTable = require('../database/create_month_table')
        var transferData = require('../database/transfer_data')
        var cleanupData = require('../database/cleanup_data')
        var init = require('../database/init_data')
        const Client = require('serverless-mysql')
        const secrets = JSON.parse(fs.readFileSync("secrets.json"));
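        describe('Transfer test', function () {
            // initialize the database client (fields assumed to mirror secrets.json)
            var client = Client({
                config: {
                    host: secrets.DB_HOST,
                    database: secrets.DB_NAME,
                    user: secrets.DB_USER,
                    password: secrets.DB_PASS
                }
            })
            // set up the test vars
            var year = 2018
            var week = 33
            // the individual tests are added here
        })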

    Within the describe block, we add individual tests for our business logic. For example, in this snippet we test the monthTable.create helper function:

        // test/database_transfer_test.js
            describe('#monthTable.create(client, week, year)', function () {
                it('exists new table for week = 33 and year = 2018', async function () {
                    await client.query(`drop table if exists weather_${year}_${week}`)
                    await monthTable.create(client, week, year)
                    var anyId = await client.query(`SELECT table_schema db,table_name tb  FROM information_schema.TABLES
                    where table_name='weather_${year}_${week}'`)
                    assert.equal(anyId.length, 1);
                })
            })

    We continue in this fashion until all key pieces of our cron job are covered by unit tests (or, if you prefer, integration tests). See all the tests in our example repo.

    To run the tests, we need to make sure we have a MySQL instance running locally. If you need to install MySQL, pick a method that works for you from the MySQL Community Downloads page.

    On our Mac, we’ve installed MySQL using Homebrew, and to start it we run:

        $ brew services start mysql
        ==> Successfully started `mysql` (label: homebrew.mxcl.mysql)

    To create the test database, we connect to MySQL via the CLI:

        $ mysql -uroot
        Welcome to the MySQL monitor.  Commands end with ; or \g.
        Your MySQL connection id is 3
        Server version: 5.7.16 Homebrew
        Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
        Oracle is a registered trademark of Oracle Corporation and/or its
        affiliates. Other names may be trademarks of their respective
        owners.
        Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
        mysql> create database testdb;
        Query OK, 1 row affected (0.00 sec)
        mysql> ^DBye

    In our secrets.json file, we set up the local credentials for the MySQL database:

        # secrets.json
        {
          "DB_HOST": "",
          "DB_USER": "root",
          "DB_PASS": "",
          "DB_NAME": "testdb"
        }

    Note: in this Homebrew setup, the MySQL root password is empty by default. Please consider changing the password to something more secure and make sure that you’re not exposing the database outside of your local development environment.

    With the credentials configured, we can now run the tests:

        $ npm test
        > cron-aws@1.0.0 test /Users/alexey/wizard/serverless-cron-job-example
        > mocha
          Transfer test
            #init(client, year, month, day, city)
              ✓ exists record for 2018/08/21 (167ms)
            #monthTable.create(client, week, year)
              ✓ exists new table for week = 33 and year = 2018
            #transferData.transfer(client, week, year)
              ✓ exists record in new and old table for week = 33 and year = 2018
            #cleanupData.cleanup(client, week, year)
              ✓ exists record in new table and not exists in old table for week = 33 and year = 2018
          Date utils test
              ✓ should return week = 33 and year = 2018 for 2018/08/21
          5 passing (205ms)

    Great! Our cron job is good to go.

    Iterating on the cron job

    In order to iterate on and update the cron job’s code, just run serverless deploy after you’ve made your changes to deploy the newest version. We recommend setting up a CI/CD pipeline to continuously validate and deploy your cron job every time you push changes to GitHub.

    The full example of the application we just walked through is available in our GitHub repo.


    In this article, we walked through creating and deploying a cron job on AWS with Serverless Framework. Using AWS Lambda may be a better fit for your cron jobs than AWS EC2, since with Lambda you pay only for what you use, and good infrastructure is already in place to deploy, monitor, and secure the cron jobs you create.

    Working with AWS Lambda directly can be challenging in terms of the developer experience. By using Serverless Framework, you get an easier deployment and iteration flow and also benefit from built-in AWS credentials management, zero-setup monitoring for your cron jobs, and more. Using Serverless Framework also helps you avoid vendor lock-in should you ever decide to migrate away from AWS.

    While we believe that using AWS Lambda with Serverless Framework is a great solution for most kinds of cron jobs, Lambda does have a number of limitations. If your jobs need to run for longer than 15 minutes, for example, or if your functions need access to special hardware (a GPU, for example), then using EC2 might be a better fit. In addition, when you have a very high volume of cron jobs running simultaneously, using EC2 might well be more cost-effective in the long run.
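For jobs that risk running into the time limit, one common approach is to check the remaining execution time and stop early, leaving the rest of the work for the next run. The sketch below assumes the standard Lambda Node.js context object, which exposes getRemainingTimeInMillis(); the `processWithinBudget` and `processItem` names and the 10-second safety margin are hypothetical.

```javascript
// Process items until the list is empty or the time budget is nearly used up.
// `context` is the Lambda context object (or any object that provides
// getRemainingTimeInMillis()); `processItem` is an async callback.
async function processWithinBudget(items, context, processItem, safetyMarginMs = 10000) {
  const remaining = items.slice();
  let processed = 0;
  while (remaining.length > 0 && context.getRemainingTimeInMillis() > safetyMarginMs) {
    await processItem(remaining.shift());
    processed += 1;
  }
  // Anything left over can be picked up by the next scheduled invocation.
  return { processed, leftover: remaining };
}
```

This only helps when the work can be split into independent pieces. A single long-running step still needs to fit within the limit, or to run somewhere else, such as EC2.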

    You can find the full example that we walked through in our GitHub repo.

    Check out the details of Serverless Framework on the Serverless website. The Serverless AWS docs might be helpful, as well as the reference for the Serverless Dashboard.

    Find more examples of Serverless applications on our Examples page.


    Version 1.0.2 released


    The Nim team is happy to announce version 1.0.2, our first patch release following Nim 1.0.0.

    To read more about version 1.0.0, take a look at our release article from just a month ago.

    Although this release comes only one month after a previous release, it has over 60 new commits, fixing over 40 reported issues, making our 1.0 release even better.

    Installing 1.0.2

    If you have installed a previous version of Nim using choosenim, getting Nim 1.0.2 is as easy as:

    $ choosenim update stable

    If you don’t have it already, you can get choosenim by following these instructions or you can install Nim by following the instructions on our install page.


    Find this release's changelog together with the rest of Nim's source code in our GitHub repository.

    PowerShell 7 Preview 5


    Today we shipped PowerShell 7 Preview 5! This release contains a number of new features and many bug fixes from both the community as well as the PowerShell team. See the Release Notes for all the details of what is included in this release.

    We are still on track to have one more preview release next month in November. Then, barring any quality concerns, a Release Candidate in December aligned with the .NET Core 3.1 final release. Finally, we expect General Availability of PowerShell 7 in January as our first Long Term Servicing release.

    Between the Release Candidate and General Availability, we will only accept critical bug fixes and no new features will be included. For that release, some Experimental Features will be considered design stable and no longer be Experimental. This means that any future design changes for those features will be considered a breaking change.

    New Features in Preview 5

    This release has a number of new features from both the community as well as the PowerShell team. Remember that preview releases of PowerShell are installed side-by-side with stable versions so you can use both and provide us feedback on previews for bugs and also on experimental features.

    You can read about new features in previous preview releases:

    There were new features in Preview 1 and Preview 2, but I didn’t blog about them… sorry!

    Chain operators

    The new Pipeline Chain Operators allow conditional execution of commands depending on whether the previous command succeeded or failed. This works with both native commands as well as PowerShell cmdlets or functions. Prior to this feature, you could already do this by using if statements and checking whether $? indicated that the last statement succeeded or failed. This new operator makes this simpler and consistent with other shells.


    Null conditional operators for coalescing and assignment

    Often in your scripts, you may need to check if a variable is $null or if a property is $null before using it. The new Null conditional operators make this simpler.

    The new ?? null coalescing operator removes the need for if and else statements if you want to get the value of a statement if it’s not $null or return something else if it is $null. Note that this doesn’t replace the check for a boolean value of true or false, it’s only checking if it’s $null.

    The new ??= null conditional assignment operator makes it easy to assign a variable a value only if it is $null.


    New PowerShell version notification

    Our telemetry available in our PowerBI Dashboard indicates some set of users are still using older versions (sometimes older previews of released stable versions!). This new feature will inform you on startup if a new preview version is available (if you are using a preview version) or if a new stable version is available to keep you up-to-date on the latest servicing release which may contain security fixes. Because this is new, you won’t see this in action until Preview 6 comes out.

    More details of this feature, including how to disable it, are in the Notification on Version Update RFC.


    Tab completion for variable assignment

    This new feature will allow you to use tab completion on variable assignment and get allowed values for enums or variables with type constraints like [ValidateSet()]. This makes it easy to change $ErrorActionPreference or the new $ErrorView (detailed below) to valid values without having to type them out.


    Format-Hex improved formatting

    This improvement comes from Joel Sallow making Format-Hex more useful when viewing different types of objects in a pipeline as well as supporting viewing more types of objects.


    Get-HotFix is back

    The Get-HotFix cmdlet only works on Windows and will query the system on what patches have been installed. This was previously unavailable in PowerShell Core 6 because it depended on System.Management namespace which wasn’t available on .NET Core 2.x which PowerShell Core 6.x is built on. However, .NET Core 3.0 which PowerShell 7 is built on brought back this namespace (for Windows only) so we re-enabled this cmdlet.

    There is a delay getting results in this example due to the number of patches I have on my Windows 7 VM.


    Select-String adds emphasis

    This was a HackIllinois project by Derek Xia that uses inverse colored text to highlight the text in a string that matches the selection criteria. There is an optional -NoEmphasis switch to suppress the emphasis.


    ConciseView for errors

    Some user feedback we’ve consistently received is about the amount of red text you get when you encounter an error in PowerShell.

    The $ErrorView preference variable allows you to change the formatting of errors. Previously, it supported NormalView (the default) as well as a more terse CategoryView. This feature adds a ConciseView where most commands return just the relevant error message. In cases where there is additional contextual information in a script file or the location in a script block, you get the line number, the line of text in question, and a pointer to where the error occurred.

    This new view is part of the Update Error View RFC so please provide feedback there!


    Get-Error cmdlet

    While ConciseView gives you more precise, but limited information on errors, we added a new cmdlet Get-Error to get much richer information on errors.

    By default, just running Get-Error shows a formatted view of the most recent error including showing specific nested types like Exceptions and ErrorRecords making it easier to diagnose what went wrong.

    This new cmdlet is part of the Update Error View RFC so please provide feedback there!



    We have one more preview planned for November with a few more features coming from the PowerShell team as well as the PowerShell community!

    Steve Lee
    PowerShell Team

    The post PowerShell 7 Preview 5 appeared first on PowerShell.
