Hari Sekhon - DevOps Perl Tools

Linux, Web, Anonymizer, SQL ReCaser, Hadoop, Hive, Solr, Big Data & NoSQL Tools

DevOps, Linux, SQL, Web, Big Data, NoSQL, templates for various programming languages and Kubernetes. All programs have --help.

See also the DevOps Bash Tools, DevOps Python Tools and Advanced Nagios Plugins Collection repos which contains hundreds more scripts and programs for Cloud, Big Data, SQL, NoSQL, Web and Linux.

Hari Sekhon

Cloud & Big Data Contractor, United Kingdom

(you're welcome to connect with me on LinkedIn)

Make sure you run `make update` if updating and not just `git pull` as you will often need the latest library submodule and possibly new upstream libraries

Quick Start

Ready to run Docker image

All programs and their pre-compiled dependencies can be found ready to run on DockerHub.

List all programs:

docker run harisekhon/perl-tools

Run any given program:

docker run harisekhon/perl-tools <program> <args>

Automated Build from source

installs git, make, pulls the repo and build the dependencies:

curl -L https://git.io/perl-bootstrap | sh

or manually

git clone https://github.com/harisekhon/devops-perl-tools perl-tools
cd perl-tools
make

Make sure to read Detailed Build Instructions further down for more information.

Optional: Generate self-contained Perl scripts with all dependencies built in to each file for easy distribution

After the make build has finished, if you want to make self-contained versions of all the perl scripts with all dependencies included for copying around, run:

make fatpacks

The self-contained scripts will be available in the fatpacks/ directory which is also tarred to fatpacks.tar.gz.

Usage

All programs come with a --help switch which includes a program description and the list of command line options.

Environment variables are supported for convenience and also to hide credentials from being exposed in the process list eg. $PASSWORD. These are indicated in the --help descriptions in brackets next to each option and often have more specific overrides with higher precedence eg. $SOLR_HOST takes priority over $HOST.

Tools

NOTE: Hadoop HDFS API Tools, Pig => Elasticsearch/Solr, Pig Jython UDFs and authenticated PySpark IPython Notebook have moved to my DevOps Python Tools repo

Linux:
- anonymize.pl - anonymizes your configs / logs from files or stdin (for pasting to Apache Jira tickets or mailing lists)
  - anonymizes:
    - hostnames / domains / FQDNs
    - email addresses
    - IP + MAC addresses
    - Kerberos principals
    - Cisco & Juniper ScreenOS configurations passwords, shared keys and SNMP strings
  - anonymize_custom.conf - put regex of your Name/Company/Project/Database/Tables to anonymize to <custom>
  - placeholder tokens indicate what was stripped out (eg. <fqdn>, <password>, <custom>)
  - --ip-prefix leaves the last IP octect to aid in cluster debugging to still see differentiated nodes communicating with each other to compare configs and log communications
- sqlcase.pl - capitalizes SQL code in files or stdin:
  - *case.pl - more specific language support for just about every database and SQL-like language out there plus a few more non-SQL languages like Neo4j Cypher and Docker's Dockerfiles:
    - athenacase.pl - AWS Athena SQL
    - cqlcase.pl - Cassandra CQL
    - cyphercase.pl - Neo4j Cypher
    - dockercase.pl - Docker (Dockerfiles)
    - drillcase.pl - Apache Drill SQL
    - hivecase.pl - Hive HQL
    - impalacase.pl - Impala SQL
    - influxcase.pl - InfluxDB InfluxQL
    - mssqlcase.pl - Microsoft SQL Server SQL
    - mysqlcase.pl - MySQL SQL
    - n1qlcase.pl - Couchbase N1QL
    - oraclecase.pl / plsqlcase.pl - Oracle SQL
    - postgrescase.pl / pgsqlcase.pl - PostgreSQL SQL
    - pigcase.pl - Pig Latin
    - prestocase.pl - Presto SQL
    - redshiftcase..pl - AWS Redshift SQL
    - snowflakecase..pl - Snowflake SQL
  - written to help clean up docs and SQL scripts (I don't even bother writing capitalised SQL code any more I just run it through this via a vim shortcut)
- diffnet.pl - simplifies diff output to show only lines added/removed, not moved, from patch files or stdin (pipe from standard diff command)
- xml_diff.pl / hadoop_config_diff.pl - tool to help find differences between XML / Hadoop configs, can diff XML from HTTP addresses to diff live running clusters
- titlecase.pl - capitalizes the first letter of each input word in files or stdin
- pdf_to_txt.pl - converts PDF to text for analytics (see also Apache PDFBox and pdf2text unix tool)
- java_show_classpath.pl - shows java classpaths of a running Java program in a sane way
- flock.pl - file locking to prevent running the same program twice at the same time. RHEL 6 now has a native version of this
- uniq_order_preserved.pl - like uniq but you don't have to sort first and it preserves the ordering
- colors.pl - prints ASCII color code matrix of all foreground + background combinations showing the corresponding terminal escape codes to help with tuning your shell
- matrix.pl - prints a cool matrix of vertical scrolling characters using terminal tricks
- welcome.pl - cool spinning welcome message greeting your username and showing last login time and user to put in your shell's .profile (there is also a python version in my DevOps Python Tools repo)
Web:
- watch_url.pl - watches a given url, outputting status code and optionally selected output, useful for debugging web farms behind load balancers and seeing the distribution to different servers (tip: set a /hostname handler to return which server you're hitting for each request in real-time). I also use this a ping replacement to google.com to check internet networking in environments where everything except HTTP traffic is blocked
- watch_nginx_stats.pl - watches nginx stats via the HttpStubStatusModule module
Hadoop Ecosystem:
- ambari_freeipa_kerberos_setup.pl - Automates Hadoop Ambari cluster security Kerberos setup of FreeIPA principals and keytab distribution to the cluster nodes
- hadoop_hdfs_file_age_out.pl - prints or removes all Hadoop HDFS files in a given directory tree older than a specified age
- hadoop_hdfs_snapshot_age_out.pl - prints or removes Hadoop HDFS snapshots older than a given age or matching a given regex pattern
- hbase_flush_tables.sh - flushes all or selected HBase tables (useful when bulk loading OpenTSDB with Durability.SKIP_WAL) (there is also a Python version of this in my DevOps Python Tools repo)
- hive_to_elasticsearch.pl - bulk indexes structured Hive tables in Hadoop to Elasticsearch clusters - includes support for Kerberos, Hive partitioned tables with selected partitions, selected columns, index creation with configurable sharding, index aliasing and optimization
- hive_table_print_null_columns.pl - finds Hive columns with all NULLs (see newer versions in DevOps Python tools repo for HiveServer2 and Impala)
- hive_table_count_rows_with_nulls.pl - counts number of rows containing NULLs in any field
- pentaho_backup.pl - script to back up the local Pentaho BA or DI Server
- ibm_bigsheets_config_git.pl - revision controls IBM BigSheets configurations from API to Git
- datameer_config_git.pl - revision controls Datameer configurations from API to Git
- hadoop_config_diff.pl - tool to diff configs between Hadoop clusters XML from files or live HTTP config endpoints
- solr_cli.pl - Solr CLI tool for fast and easy Solr / SolrCloud administration. Supports optional environment variables to minimize --switches (can be set permanently in solr/solr-env.sh). Uses the Solr Cores and Collections APIs, makes Solr administration a lot easier

Detailed Build Instructions

The 'make' command will initialize my library submodule and use 'sudo' to install the required system packages and CPAN modules. If you want more control over what is installed you must follow the Manual Setup section instead.

Perlbrew localized installs

The automated build will use 'sudo' to install required Perl CPAN libraries to the system unless running as root or it detects being inside Perlbrew. If you want to install some of the common Perl libraries such as Net::DNS and LWP::* using your OS packages instead of installing from CPAN then follow the Manual Build section below.

Manual Setup

Enter the tools directory and run git submodule init and git submodule update to fetch my library repo and then install the CPAN modules as mentioned further down:

git clone https://github.com/harisekhon/devops-perl-tools perl-tools
cd tools
git submodule update --init

Then proceed to install the CPAN modules below by hand.

CPAN Modules

Install the following CPAN modules using the cpan command, using sudo if you're not root:

sudo cpan JSON LWP::Simple LWP::UserAgent Term::ReadKey Text::Unidecode Time::HiRes XML::LibXML XML::Validate ...

The full list of CPAN modules is in setup/cpan-requirements.txt.

You can install the entire list of cpan requirements like so:

sudo cpan $(sed 's/#.*//' < setup/cpan-requirements.txt)

You're now ready to use these programs.

Offline Setup

Download the Tools and Lib git repos as zip files:

https://github.com/HariSekhon/devops-perl-tools/archive/master.zip

https://github.com/HariSekhon/lib/archive/master.zip

Unzip both and move Lib to the lib folder under Tools.

unzip devops-perl-tools-master.zip
unzip lib-master.zip

mv -v devops-perl-tools-master perl-tools
mv -v lib-master lib
mv -vf lib perl-tools/

Proceed to install CPAN modules for whichever programs you want to use using your standard procedure - usually an internal mirror or proxy server to CPAN, or rpms / debs (some libraries are packaged by Linux distributions).

All CPAN modules are listed in the setup/cpan-requirements.txt file.

Configuration for Strict Domain / FQDN validation

Strict validations include host/domain/FQDNs using TLDs which are populated from the official IANA list. This is done via my Lib submodule - see there for details on configuring to permit custom TLDs like .local, .intranet, .vm, .cloud etc. (all already included in there because they're common across companies internal environments).

Updating

Run make update. This will git pull and then git submodule update which is necessary to pick up corresponding library updates.

If you update often and want to just quickly git pull + submodule update but skip rebuilding all those dependencies each time then run make update-no-recompile (will miss new library dependencies - do full make update if you encounter issues).

Testing

Continuous Integration is run on this repo with tests for success and failure scenarios:

unit tests for the custom supporting perl library
integration tests of the top level programs using the libraries for things like option parsing
functional tests for the top level programs using local test data and Docker containers

To trigger all tests run:

make test

which will start with the underlying libraries, then move on to top level integration tests and functional tests using docker containers if docker is available.

Contributions

Patches, improvements and even general feedback are welcome in the form of GitHub pull requests and issue tickets.

Stargazers over time

git.io/perl-tools

Name		Name	Last commit message	Last commit date
Latest commit History 3,012 Commits
.buildkite		.buildkite
.circleci		.circleci
.github		.github
.semaphore		.semaphore
bash-tools @ a6913bd		bash-tools @ a6913bd
contrib		contrib
lib @ 22065d0		lib @ 22065d0
setup		setup
solr		solr
sql @ c83e4a4		sql @ c83e4a4
sql-keywords @ b5c47c6		sql-keywords @ b5c47c6
templates @ 03efa39		templates @ 03efa39
tests		tests
.appveyor.yml		.appveyor.yml
.cirrus.yml		.cirrus.yml
.concourse.yml		.concourse.yml
.drone.yml		.drone.yml
.gitignore		.gitignore
.gitlab-ci.yml		.gitlab-ci.yml
.gitmodules		.gitmodules
.gocd.yml		.gocd.yml
.sonarcloud.properties		.sonarcloud.properties
.travis.yml		.travis.yml
Jenkinsfile		Jenkinsfile
LICENSE		LICENSE
Makefile		Makefile
Makefile.PL		Makefile.PL
README.md		README.md
ambari-new-nodes-dn-nm.csv.tmpl		ambari-new-nodes-dn-nm.csv.tmpl
ambari_freeipa_kerberos_setup.pl		ambari_freeipa_kerberos_setup.pl
anonymize.pl		anonymize.pl
anonymize_custom.conf		anonymize_custom.conf
anonymize_ignore.conf		anonymize_ignore.conf
athenacase.pl		athenacase.pl
azure-pipelines.yml		azure-pipelines.yml
bitbucket-pipelines.yml		bitbucket-pipelines.yml
boot		boot
buddy.yml		buddy.yml
cassandra_cqlcase.pl		cassandra_cqlcase.pl
codefresh.yml		codefresh.yml
colors.pl		colors.pl
cqlcase.pl		cqlcase.pl
cyphercase.pl		cyphercase.pl
datameer_config_git.pl		datameer_config_git.pl
diffnet.pl		diffnet.pl
dockercase.pl		dockercase.pl
drillcase.pl		drillcase.pl
flock.pl		flock.pl
hadoop_config_diff.pl		hadoop_config_diff.pl
hadoop_hdfs_file_age_out.pl		hadoop_hdfs_file_age_out.pl
hadoop_hdfs_snapshot_age_out.pl		hadoop_hdfs_snapshot_age_out.pl
hbase_flush_tables.sh		hbase_flush_tables.sh
hive_table_count_rows_with_nulls.pl		hive_table_count_rows_with_nulls.pl
hive_table_print_null_columns.pl		hive_table_print_null_columns.pl
hive_to_elasticsearch.pl		hive_to_elasticsearch.pl
hivecase.pl		hivecase.pl
ibm_bigsheets_config_git.pl		ibm_bigsheets_config_git.pl
impalacase.pl		impalacase.pl
influxcase.pl		influxcase.pl
java_show_classpath.pl		java_show_classpath.pl
matrix.pl		matrix.pl
mssqlcase.pl		mssqlcase.pl
mysqlcase.pl		mysqlcase.pl
n1qlcase.pl		n1qlcase.pl
neo4j_cyphercase.pl		neo4j_cyphercase.pl
new.pl		new.pl
oraclecase.pl		oraclecase.pl
pdf2txt.pl		pdf2txt.pl
pentaho_backup.pl		pentaho_backup.pl
perl_find_library_path.pl		perl_find_library_path.pl
perlpath.pl		perlpath.pl
pgsqlcase.pl		pgsqlcase.pl
pigcase.pl		pigcase.pl
plsqlcase.pl		plsqlcase.pl
postgrescase.pl		postgrescase.pl
prestocase.pl		prestocase.pl
recase.pl		recase.pl
recase_keywords.conf		recase_keywords.conf
redshiftcase.pl		redshiftcase.pl
shippable.yml		shippable.yml
snowflakecase.pl		snowflakecase.pl
solr_cli.pl		solr_cli.pl
sqlcase.pl		sqlcase.pl
strip_ansi_escape_codes.pl		strip_ansi_escape_codes.pl
titlecase.pl		titlecase.pl
unidecode.pl		unidecode.pl
uniq_order_preserved.pl		uniq_order_preserved.pl
urldecode.pl		urldecode.pl
urlencode.pl		urlencode.pl
watch_nginx_stats.pl		watch_nginx_stats.pl
watch_url.pl		watch_url.pl
welcome.pl		welcome.pl
wercker.yml		wercker.yml
xml-diff.xsl		xml-diff.xsl
xml_diff.pl		xml_diff.pl

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Hari Sekhon - DevOps Perl Tools

Linux, Web, Anonymizer, SQL ReCaser, Hadoop, Hive, Solr, Big Data & NoSQL Tools

(you're welcome to connect with me on LinkedIn)

Make sure you run `make update` if updating and not just `git pull` as you will often need the latest library submodule and possibly new upstream libraries

Quick Start

Ready to run Docker image

Automated Build from source

Optional: Generate self-contained Perl scripts with all dependencies built in to each file for easy distribution

Usage

Tools

NOTE: Hadoop HDFS API Tools, Pig => Elasticsearch/Solr, Pig Jython UDFs and authenticated PySpark IPython Notebook have moved to my DevOps Python Tools repo

Detailed Build Instructions

Perlbrew localized installs

Manual Setup

CPAN Modules

Offline Setup

Configuration for Strict Domain / FQDN validation

Updating

Testing

Contributions

See Also

Stargazers over time

About

Releases

Packages

Languages

License

dhd80/DevOps-Perl-tools

Folders and files

Latest commit

History

Repository files navigation

Hari Sekhon - DevOps Perl Tools

Linux, Web, Anonymizer, SQL ReCaser, Hadoop, Hive, Solr, Big Data & NoSQL Tools

(you're welcome to connect with me on LinkedIn)

Make sure you run make update if updating and not just git pull as you will often need the latest library submodule and possibly new upstream libraries

Quick Start

Ready to run Docker image

Automated Build from source

Optional: Generate self-contained Perl scripts with all dependencies built in to each file for easy distribution

Usage

Tools

NOTE: Hadoop HDFS API Tools, Pig => Elasticsearch/Solr, Pig Jython UDFs and authenticated PySpark IPython Notebook have moved to my DevOps Python Tools repo

Detailed Build Instructions

Perlbrew localized installs

Manual Setup

CPAN Modules

Offline Setup

Configuration for Strict Domain / FQDN validation

Updating

Testing

Contributions

See Also

Stargazers over time

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Make sure you run `make update` if updating and not just `git pull` as you will often need the latest library submodule and possibly new upstream libraries

Packages