Pepper-Box Load Generator

Mar 31, 2017. | By: Satish Bhor

Pepper-Box is a Kafka load generator that can be used as a JMeter plugin or as a standalone utility.

It can send Kafka messages as plain text (JSON, XML, CSV, or any other custom format) as well as Java serialized objects. Pepper-Box includes a template engine and random data generation functions, which help you design messages in any format.

When used with JMeter, all JMeter features remain available. Pepper-Box is particularly useful when implementing streaming analytics and data pipelines, where the input data format is tightly coupled to the business problem.

Pepper-Box includes four main components:

Pepper-Box Kafka Sampler :

This is a JMeter Java sampler that acts as a Kafka producer. It gets messages from the backend data generator and sends them to the Kafka broker at a given throttled rate. If the Kafka broker is set up with SSL/TLS data encryption and Kerberos authentication, these security settings can be configured in the sampler. By default the sampler includes the required and tuning-related parameters, but it also provides a user interface for configuring other Kafka producer parameters.

Pepper-Box PlainText Config Element :

This JMeter config element generates plain text messages for the Kafka sampler, based on an input message schema designed with the template engine. It provides a user interface for entering the message schema template. Before the test starts, the config element takes the schema, processes it, and creates a plain text message iterator that can generate millions of plain text messages per second.

Pepper-Box Serialized Config Element :

This JMeter config element generates serialized object messages for the Kafka sampler, based on an input class and its field mappings to template functions. It takes the class name and its field mappings to random data generation functions, processes them, and creates a Java object iterator that can generate millions of serialized object messages per second.

Pepper-Box Console Load Generator :

This is a standalone Kafka load generator that does not require JMeter. It currently supports only plain text message generation; Java serialized message generation is planned as a future feature. The console-based load generator takes the required details (message schema file, producer property file, throttle rate, duration, number of producer threads, etc.) and starts generating load at the given throttled rate.
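As a rough illustration of what "generating load at a given throttled rate" for a fixed duration means, here is a minimal Python sketch. It is not Pepper-Box's own code: the `throttled` function, its parameter names, and the injectable `clock` and `sleep` arguments are all hypothetical and exist only for this example.

```python
import time
from typing import Callable, Iterator


def throttled(messages: Iterator[str], per_second: float, duration_s: float,
              clock: Callable[[], float] = time.monotonic,
              sleep: Callable[[float], None] = time.sleep) -> Iterator[str]:
    """Yield messages no faster than per_second, stopping after duration_s."""
    interval = 1.0 / per_second   # minimum gap between two sends
    start = clock()
    next_send = start             # earliest time the next message may go out
    for message in messages:
        now = clock()
        send_at = max(now, next_send)
        if send_at - start >= duration_s:
            break                 # the next send would fall outside the test window
        if send_at > now:
            sleep(send_at - now)  # wait until the next slot opens
        yield message
        next_send = send_at + interval
```

In a real generator, each yielded message would be handed to a Kafka producer; several such loops could run in parallel, one per producer thread.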

Below is a sample JSON message schema with five fields. The values are template functions, which are replaced with dynamically generated random values on every iteration:

"messageId":{{SEQUENCE("messageId", 1, 1)}},
"messageBody":"{{RANDOM_ALPHA_NUMERIC("abcdefg", 2)}}",
"messageCategory":"{{RANDOM_STRING("Finance","Shares","Healthcare")}}",
"messageStatus":"{{RANDOM_STRING("Accepted","Pending","Processing")}}",

Pepper-Box workflow

When we enter an input schema (such as the sample shown above) and start the test, Pepper-Box follows a series of steps before producing actual load on Kafka.

  1. The schema is given as input to the schema parser, which parses it and prepares a series of expressions.
  2. The schema translator then converts that series of expressions into a Java class.
  3. The translated Java class is then compiled and instantiated as a message iterator.
  4. To produce messages on Kafka, this message iterator is iterated for the specified test duration.
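The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Pepper-Box's implementation: Pepper-Box translates the expressions into a Java class and compiles it, whereas this sketch simply keeps a list of closures. The names `parse`, `message_iterator`, and `FUNCTIONS` are hypothetical, only two template functions are covered, and the assumed meaning of the `SEQUENCE` arguments (name, start, increment) is a guess based on the sample schema.

```python
import ast
import itertools
import random
import re
from typing import Callable, Iterator, List

# Matches placeholders such as {{SEQUENCE("messageId", 1, 1)}}.
PLACEHOLDER = re.compile(r"\{\{(\w+)\((.*?)\)\}\}")


def make_sequence(name: str, start: int, step: int) -> Callable[[], str]:
    # Assumed semantics: a counter beginning at `start`, advancing by `step`.
    counter = itertools.count(start, step)
    return lambda: str(next(counter))


def make_random_string(*choices: str) -> Callable[[], str]:
    # Picks one of the given strings on every call.
    return lambda: random.choice(choices)


FUNCTIONS = {"SEQUENCE": make_sequence, "RANDOM_STRING": make_random_string}


def parse(schema: str) -> List[Callable[[], str]]:
    """Step 1: split the schema into literal text and function expressions."""
    expressions: List[Callable[[], str]] = []
    position = 0
    for match in PLACEHOLDER.finditer(schema):
        literal = schema[position:match.start()]
        if literal:
            expressions.append(lambda text=literal: text)
        name, raw_args = match.group(1), match.group(2)
        # Read the argument list as a tuple of Python literals.
        args = ast.literal_eval("(" + raw_args + ",)") if raw_args.strip() else ()
        expressions.append(FUNCTIONS[name](*args))
        position = match.end()
    tail = schema[position:]
    if tail:
        expressions.append(lambda text=tail: text)
    return expressions


def message_iterator(schema: str) -> Iterator[str]:
    """Steps 2 to 4: evaluate every expression once per generated message."""
    expressions = parse(schema)
    while True:
        yield "".join(expression() for expression in expressions)
```

Because the schema is parsed once, before any message is produced, the per-message work is reduced to evaluating the prepared expressions and joining the results.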

Open Source Contribution to Docker

Feb 28, 2017. | By: Milind Chawre

Docker is an open platform for developing, shipping, and running applications. Docker enables you to separate your applications from your infrastructure so you can deliver software quickly. With Docker, you can manage your infrastructure in the same ways you manage your applications. In the past, applications ran on physical servers. Virtualization then changed this by running applications on virtual servers. Now, Docker enables the creation of application containers that are much more efficient than running separate virtual machines.

In a very short period of time, Docker has gained broad community support, including active contributions from industry leaders such as Google and Red Hat. There are around 1000 active community contributors to the Docker repository, 100,000+ dockerized apps on Docker Hub, and 50,000 third-party projects on GitHub using Docker. Many software giants are experimenting with the Docker ecosystem to evaluate whether they can containerize their applications. Cloud vendors such as Microsoft Azure and Amazon have enabled containerization support in their cloud platforms using Docker. Docker is also natively supported by OpenStack. Google has built a management suite called Kubernetes, which supports Docker. Almost all major players in the industry are already involved in, or trying to become part of, the Docker ecosystem in one way or another. It is currently a trending technology that experts claim is here to stay.

To get acquainted with this trending technology, we proactively started exploring Docker and experimenting with its use cases. To build further expertise, we started taking up issues and fixing them. Here is a glance at our contributions so far:

  1. Hostname resolution for local NFS volumes: Docker can mount NFS volumes inside containers, but this functionality only allowed IP addresses to identify NFS volumes. In production, the common practice is to address NFS volumes by hostname instead. We therefore added a feature to Docker that resolves NFS volume addresses given as host names, DNS names, or IP addresses.

    Issue :

    PR :

  2. Max restart time support for docker container: Docker offers a restart policy, which defines whether to restart a container after it crashes. Under the restart policy, a failed container is restarted after a specified interval. If the restart fails, the restart interval is doubled. As there was no upper bound on this interval, a container that kept failing might not be restarted for days or weeks. We solved this issue by capping the restart interval at a maximum of 1 minute.

    Issue :

    PR :

  3. Improved docker cli help: A few of the docker cli commands were missing help text for the allowed time units, so we added explicit help listing all the valid time units. Default values for some docker cli input parameters were also undocumented and unknown to the user. We identified the default values of various parameters and their meanings, and updated the docker cli help to reflect them.

    Issue :

    PR :

    Issue :

    PR :

  4. Remove obsolete docker documentation/code: In the docker remote API, the login token name registryToken was replaced by identityToken, but the documentation still referred to the older name, registryToken. We updated the documentation to make it consistent with the code.

    Issue :

    PR :

    Removed unused configurations for fluentd logging driver.

    Issue :

    PR :

    Removed bad references in docker volume documentation.

    Issue :

    PR :

  5. Clarifying docker behaviour in documentation: Docker allows a network to be attached to a created or stopped container. But when we inspect the container, the network information is missing. This is because the network is attached to the container only when it is run. This information was missing from the docker documentation, so we added it.

    Issue :

    PR :

    Docker added a new feature called live-restore, which is supported for Linux containers only; it is not implemented for Windows containers. We added this note to the live-restore feature documentation.

    Issue :

    PRs :

    dockerd has two options that can conflict: --ip-masq and --iptables. --ip-masq automatically adds iptables rules to enable external connectivity, while --iptables enables or disables dockerd's automatic updates to iptables rules. If --iptables is set to false and --ip-masq to true, the --iptables option takes precedence and dockerd does not automatically update the iptables rules. We documented this behaviour in the dockerd help and documentation.

    Issue :

    PR :
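The capped restart interval described in item 2 above can be sketched as follows. This is a minimal Python illustration, not Docker's own code: `restart_delays` and its parameters are hypothetical names, and the 0.1 second starting delay is an assumption made for the example. Only the doubling and the 1 minute cap come from the description above.

```python
from typing import Iterator


def restart_delays(initial_s: float = 0.1, cap_s: float = 60.0) -> Iterator[float]:
    """Yield the wait before each successive restart attempt, in seconds."""
    delay = initial_s
    while True:
        yield min(delay, cap_s)
        # Double the wait after every failed restart, but never beyond the cap.
        delay = min(delay * 2, cap_s)
```

Without the cap the delay grows exponentially, so after a few dozen consecutive failures the next attempt would be days away; with the cap, a continuously failing container is still retried at least once a minute.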


Benchmarking MySQL NDB Cluster with ScaleArc using SysBench

Feb 17, 2017. | By: Trilok Khairnar

While NoSQL and NewSQL systems are maturing as high-performance data store options and are increasingly being adopted, relational databases are based on a proven and solid model. Several scalable products still use them and, as their databases grow into large clusters, need sharding, caching, and routing to protect their investment without significant re-engineering and to operate 24×7.

Fluent API Client for Openstack - Trove

Dec 1, 2016. | By: Shital Patil, Sumit Gandhi

OpenStack Trove is a DBaaS (Database as a Service) solution. It offers IT organizations the ability to operate a complete DBaaS platform within the enterprise. IT organizations can offer a rich variety of databases to their internal customers with the same ease of use that Amazon offers with its AWS cloud and the RDS product. OpenStack Trove supports both RDBMS and NoSQL databases.

OTP Library

Aug 23, 2016. | By: Abdul Waheed

Many organizations today still struggle to provide strong authentication for their web-based applications. Most continue to rely solely on passwords for user authentication, which tend to be weak (so that they are easy to memorize), shared across systems, etc. Though there have been strides towards strong authentication mechanisms like 2FA, adoption has been low.

Openstack 4j CLI

Jul 13, 2016. | By: Vinod Borole

This project was started with a thought of having an easy automation tool to interact with Openstack. Considering the challenges one has with existing Openstack CLI, this tool offers a very good starting point in overcoming those challenges.

Fluent API Client for Openstack - Group Based Policy

May 23, 2016. | By: Vinod Borole

OpenStack is popular and is backed by a major community-based initiative, with thousands of contributors in more than a hundred countries. It is time to focus on the real challenges of deploying and delivering applications and services with flexibility, security, speed, and scale, rather than just the orchestration of infrastructure components. To achieve this, there is a need for a declarative policy engine. One such project is Group Based Policy.


GS Lab is committed to open source technology and the commitment to collaboration is innate in our DNA. GSOpenLab is committed to drive the growth of open source technologies with its efforts on community-led initiatives, collaborations and industry focused open source code development.
