Ruby Open Text File and Read Lines Bash Terminal

One-liner introduction

This chapter will give an overview of ruby syntax for command line usage and some examples to show what kind of problems are typically suited for one-liners.

Why use Ruby for one-liners?

I assume you are already familiar with use cases where the command line is more productive than a GUI. See also this series of articles titled Unix as IDE.

A shell utility like bash provides built-in commands and scripting features to make it easier to solve and automate various tasks. External *nix commands like grep, sed, awk, sort, find, parallel, etc. can be combined to work with each other. Depending upon your familiarity with those tools, you can either use ruby as a single replacement or complement them for specific use cases.

Here are some one-liners (options will be explained later):

  • ruby -e 'puts readlines.uniq' *.txt - retain only one copy if lines are duplicated across the given list of input file(s)
  • ruby -e 'puts readlines.uniq {|s| s.split[1]}' *.txt - retain only the first copy of duplicate lines, using the second field as the duplicate criterion
  • ruby -rcommonregex -ne 'puts CommonRegex.get_links($_)' *.md - extract only the URLs, using the third-party CommonRegexRuby library
  • stackoverflow: merge duplicate key values while preserving order - where I provide a simpler ruby solution compared to awk
  • unix.stackexchange: pair each line of file - an example where ruby's vast built-in features make it easier to write a solution
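As a rough illustration of what the first two one-liners do, here is a minimal script-form sketch. The sample lines and the method names are made up for this illustration and are not part of the original examples.

```ruby
# Made-up sample data: whole lines, newline included, as readlines would return them.
SAMPLE_LINES = ["apple 10\n", "mango 10\n", "apple 10\n", "fig 20\n"]

# Drop exact duplicate lines, keeping the first occurrence of each.
def unique_lines(lines)
  lines.uniq
end

# Drop lines whose second whitespace-separated field has already been seen.
def unique_by_second_field(lines)
  lines.uniq { |s| s.split[1] }
end

print unique_lines(SAMPLE_LINES).join
puts "---"
print unique_by_second_field(SAMPLE_LINES).join
```

Array#uniq keeps the first element for each distinct value (or for each distinct block result), which is why the order of the input is preserved.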

The main advantage of ruby over tools like grep, sed and awk is its feature-rich regular expression engine, standard library and third-party libraries. If you don't already know the syntax and idioms for sed and awk, learning the command line options for ruby would be the easier option. Another advantage is that ruby is more portable, given the many differences between GNU, BSD, Mac and other such implementations. The main disadvantage is that ruby is likely to be more verbose and slower for features that are supported out of the box by those tools.

Command line options

Option                   Description
-0[octal]                specify record separator (\0, if no argument)
-a                       autosplit mode with -n or -p (splits $_ into $F)
-c                       check syntax only
-Cdirectory              cd to directory before executing your script
-d                       set debugging flags (set $DEBUG to true)
-e 'command'             one line of script. Several -e's allowed. Omit [programfile]
-Eex[:in]                specify the default external and internal character encodings
-Fpattern                split() pattern for autosplit (-a)
-i[extension]            edit ARGV files in place (make backup if extension supplied)
-Idirectory              specify $LOAD_PATH directory (may be used more than once)
-l                       enable line ending processing
-n                       assume 'while gets(); ... end' loop around your script
-p                       assume loop like -n but print line also like sed
-rlibrary                require the library before executing your script
-s                       enable some switch parsing for switches after script name
-S                       look for the script using PATH environment variable
-v                       print the version number, then turn on verbose mode
-w                       turn warnings on for your script
-W[level=2|:category]    set warning level; 0=silence, 1=medium, 2=verbose
-x[directory]            strip off text before #!ruby line and perhaps cd to directory
--jit                    enable JIT with default options (experimental)
--jit-[option]           enable JIT with an option (experimental)
-h                       show this message, --help for more info

This chapter will show examples with the -e, -n, -p and -a options. Some more options will be covered in later chapters, but not all of them are discussed in this book.

Executing Ruby code

If you want to execute a ruby program file, one way is to pass the filename as an argument to the ruby command.

    $ echo 'puts "Hello Ruby"' > hello.rb
    $ ruby hello.rb
    Hello Ruby

For short programs, you can also directly pass the code as an argument to the -e option.

    $ ruby -e 'puts "Hello Ruby"'
    Hello Ruby

    $ # multiple statements can be issued separated by ;
    $ ruby -e 'x=25; y=12; puts x**y'
    59604644775390625
    $ # or use -e option multiple times
    $ ruby -e 'x=25' -e 'y=12' -e 'puts x**y'
    59604644775390625

Filtering

ruby one-liners can be used for filtering lines matched by a regexp, similar to grep, sed and awk. And similar to many command line utilities, ruby can accept input from both stdin and file arguments.

    $ # sample stdin data
    $ printf 'gate\napple\nwhat\nkite\n'
    gate
    apple
    what
    kite

    $ # print all lines containing 'at'
    $ # same as: grep 'at' and sed -n '/at/p' and awk '/at/'
    $ printf 'gate\napple\nwhat\nkite\n' | ruby -ne 'print if /at/'
    gate
    what

    $ # print all lines NOT containing 'e'
    $ # same as: grep -v 'e' and sed -n '/e/!p' and awk '!/e/'
    $ printf 'gate\napple\nwhat\nkite\n' | ruby -ne 'print if !/e/'
    what

By default, grep, sed and awk will automatically loop over input content line by line (with \n as the line distinguishing character). The -n or -p option will enable this feature for ruby. As seen before, the -e option accepts code as a command line argument. Many shortcuts are available to reduce the amount of typing needed.

In the above examples, a regular expression (defined by the pattern between a pair of forward slashes) has been used to filter the input. When the input string isn't specified in a conditional context (for example: if), the test is performed against the global variable $_, which has the contents of the input line (the correct term would be input record, see Record separators chapter). To summarize, in a conditional context:

  • /regexp/ is a shortcut for $_ =~ /regexp/
  • !/regexp/ is a shortcut for $_ !~ /regexp/
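To make the two shortcuts concrete, here is a minimal sketch of roughly what the filtering one-liners expand to when written as a plain script. The method name grep_lines and its negate: keyword are hypothetical, chosen only for this illustration, and StringIO stands in for stdin.

```ruby
require 'stringio'

# Rough script form of: ruby -ne 'print if /at/'
# and, with negate: true, of: ruby -ne 'print if !/at/'
def grep_lines(input, regexp, negate: false)
  selected = []
  while (line = input.gets)      # -n supplies this loop and stores each line in $_
    matched = line =~ regexp     # what a bare /regexp/ does against $_ in a condition
    selected << line if negate ? !matched : matched
  end
  selected
end

data = "gate\napple\nwhat\nkite\n"
print grep_lines(StringIO.new(data), /at/).join
print grep_lines(StringIO.new(data), /e/, negate: true).join
```

The explicit line variable is used here instead of $_ only to keep the sketch easy to follow; the one-liner form relies on $_ being set for you.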

$_ is also the default argument for the print method, which is why it is generally preferred in one-liners over the puts method. More such defaults that apply to the print method will be discussed later.

info See ruby-doc: Pre-defined global variables for documentation on $_, $&, etc.

Here's an example with file input instead of stdin.

    $ cat table.txt
    brown bread mat hair 42
    blue cake mug shirt -7
    yellow banana window shoes 3.14

    $ # same as: grep -oE '[0-9]+$' table.txt
    $ ruby -ne 'puts $& if /\d+$/' table.txt
    42
    7
    14
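In the example above, $& holds the portion of the line matched by the most recent successful regexp match. As a hedged alternative sketch, the same extraction can be written without $& by indexing the string with a regexp. The constant and method names below are made up for this illustration, and the table contents are embedded so the snippet runs without the file.

```ruby
# Contents of table.txt, embedded so the sketch is self-contained.
TABLE_TEXT = "brown bread mat hair 42\n" \
             "blue cake mug shirt -7\n" \
             "yellow banana window shoes 3.14\n"

# Return the trailing run of digits from every line that has one.
# String#[] with a regexp gives the matched text, or nil when nothing matches.
def trailing_digits(text)
  text.each_line.map { |line| line[/\d+$/] }.compact
end

puts trailing_digits(TABLE_TEXT)
```

Note that \d+ does not match the sign or the decimal point, which is why the last two lines yield 7 and 14 rather than -7 and 3.14.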

info The learn_ruby_oneliners repo has all the files used in examples.

Substitution

Use the sub and gsub methods for search and replace requirements. By default, these methods operate on $_ when the input string isn't provided. For these examples, the -p option is used instead of the -n option, so that the value of $_ is automatically printed after processing each input line.

    $ # for each input line, change only first ':' to '-'
    $ # same as: sed 's/:/-/' and awk '{sub(/:/, "-")} 1'
    $ printf '1:2:3:4\na:b:c:d\n' | ruby -pe 'sub(/:/, "-")'
    1-2:3:4
    a-b:c:d

    $ # for each input line, change all ':' to '-'
    $ # same as: sed 's/:/-/g' and awk '{gsub(/:/, "-")} 1'
    $ printf '1:2:3:4\na:b:c:d\n' | ruby -pe 'gsub(/:/, "-")'
    1-2-3-4
    a-b-c-d

You might wonder how $_ is modified without the use of ! methods. The reason is that these methods are part of Kernel (see ruby-doc: Kernel for details) and are available only when the -n or -p options are used.

  • sub(/regexp/, repl) is a shortcut for $_.sub(/regexp/, repl) and $_ will be updated if substitution succeeds
  • gsub(/regexp/, repl) is a shortcut for $_.gsub(/regexp/, repl) and $_ gets updated if substitution succeeds
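Outside of -n and -p, the same substitutions are written as String methods. The following is a small sketch with hypothetical method names (replace_first, replace_all) that also shows the difference between the plain and the ! variants.

```ruby
# Replace only the first ':' in the line, like: ruby -pe 'sub(/:/, "-")'
def replace_first(line)
  line.sub(/:/, "-")
end

# Replace every ':' in the line, like: ruby -pe 'gsub(/:/, "-")'
def replace_all(line)
  line.gsub(/:/, "-")
end

original = "1:2:3:4\n"
print replace_first(original)
print replace_all(original)
print original                         # unchanged: sub and gsub return new strings

# The ! variants modify the receiver and return nil when nothing was replaced.
p "abc".dup.sub!(/:/, "-")
```

The nil return of sub! and gsub! is handy in a conditional, but it also means their result should not be chained or printed directly.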

info This book assumes you are already familiar with regular expressions. If not, you can check out my free Ruby Regexp book.

Field processing

Consider the sample input file shown below with fields separated by a single space character.

    $ cat table.txt
    brown bread mat hair 42
    blue cake mug shirt -7
    yellow banana window shoes 3.14

Here are some examples that are based on a specific field rather than the entire line. The -a option will cause the input line to be split based on whitespace and the field contents can be accessed using the $F global variable. Leading and trailing whitespace will be suppressed and won't result in empty fields. More details are discussed in the Default field separation section.

    $ # print the second field of each input line
    $ # same as: awk '{print $2}' table.txt
    $ ruby -ane 'puts $F[1]' table.txt
    bread
    cake
    banana

    $ # print lines only if last field is a negative number
    $ # same as: awk '$NF<0' table.txt
    $ ruby -ane 'print if $F[-1].to_f < 0' table.txt
    blue cake mug shirt -7

    $ # change 'b' to 'B' only for the first field
    $ # same as: awk '{gsub(/b/, "B", $1)} 1' table.txt
    $ ruby -ane '$F[0].gsub!(/b/, "B"); puts $F * " "' table.txt
    Brown bread mat hair 42
    Blue cake mug shirt -7
    yellow banana window shoes 3.14
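For comparison, here is a minimal script-form sketch of the field operations used above, with String#split playing the role of -a. The method names are hypothetical, and the sketch assumes every line has at least one field (blank lines are not handled).

```ruby
# What -a does for each line: split on whitespace, ignoring leading and trailing blanks.
def fields_of(line)
  line.split
end

# True when the last field, read as a float, is below zero.
def negative_last_field?(line)
  fields_of(line)[-1].to_f < 0
end

# Change 'b' to 'B' in the first field only, then rebuild the line.
def upcase_b_in_first_field(line)
  fields = fields_of(line)
  fields[0] = fields[0].gsub(/b/, "B")
  fields * " "                  # Array#* with a string argument behaves like join
end

rows = ["brown bread mat hair 42", "blue cake mug shirt -7", "yellow banana window shoes 3.14"]
puts rows.map { |row| fields_of(row)[1] }
puts rows.select { |row| negative_last_field?(row) }
puts rows.map { |row| upcase_b_in_first_field(row) }
```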

BEGIN and END

You can use a BEGIN{} block when you need to execute something before input is read and an END{} block to execute something after all of the input has been processed.

    $ # same as: awk 'BEGIN{print "---"} 1; END{print "%%%"}'
    $ # note the use of ; after BEGIN block
    $ seq 4 | ruby -pe 'BEGIN{puts "---"}; END{puts "%%%"}'
    ---
    1
    2
    3
    4
    %%%
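A common reason to reach for these blocks is to initialize something before the loop and report a result after it. The sketch below shows that shape as an ordinary method rather than as a one-liner; report_total is a made-up name, and StringIO stands in for stdin.

```ruby
require 'stringio'

# Sum the integers found at the start of each input line.
def report_total(input, out = $stdout)
  total = 0                                       # the BEGIN-like step: runs once, before any input
  input.each_line { |line| total += line.to_i }   # the per-line body that -n would wrap in a loop
  out.puts "total: #{total}"                      # the END-like step: runs once, after all input
  total
end

report_total(StringIO.new("1\n2\n3\n4\n"))
```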

ENV hash

When it comes to automation and scripting, you'd often need to construct commands that can accept input from the user, a file, the output of a shell command, etc. As mentioned earlier, this book assumes bash as the shell being used. To access environment variables of the shell, you can call the special hash variable ENV with the name of the environment variable as a string key.

    $ # existing environment variable
    $ # output shown here is for my machine, would differ for you
    $ ruby -e 'puts ENV["HOME"]'
    /home/learnbyexample
    $ ruby -e 'puts ENV["SHELL"]'
    /bin/bash

    $ # defined along with ruby command
    $ # note that the variable is placed before the shell command
    $ word='hello' ruby -e 'puts ENV["word"]'
    hello
    $ # the input characters are preserved as is
    $ ip='hi\nbye' ruby -e 'puts ENV["ip"]'
    hi\nbye

Here's another example where a regexp is passed as an environment variable content.

    $ cat word_anchors.txt
    sub par
    spar
    apparent effort
    two spare computers
    cart part tart mart

    $ # assume 'r' is a shell variable that has to be passed to the ruby command
    $ r='\Bpar\B'
    $ rgx="$r" ruby -ne 'print if /#{ENV["rgx"]}/' word_anchors.txt
    apparent effort
    two spare computers
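The interpolation /#{ENV["rgx"]}/ treats the variable's value as regexp source. A minimal script-form sketch of the same idea follows; the method name lines_matching_env and its literal: keyword are assumptions made for this illustration, and the environment variable is set inside the script only so that the snippet runs on its own.

```ruby
# Lines of word_anchors.txt, embedded so the sketch is self-contained.
WORD_LINES = ["sub par", "spar", "apparent effort", "two spare computers", "cart part tart mart"]

# Build a regexp from the named environment variable and keep the matching lines.
# With literal: true, regexp metacharacters in the value are escaped first.
def lines_matching_env(lines, name, literal: false)
  source = ENV.fetch(name)                  # raises KeyError if the variable is not set
  source = Regexp.escape(source) if literal
  lines.grep(Regexp.new(source))
end

ENV["rgx"] = '\Bpar\B'                      # normally supplied by the shell, as in the example above
puts lines_matching_env(WORD_LINES, "rgx")
```

ENV.fetch is used instead of ENV[] so that a missing variable fails loudly rather than silently producing an empty regexp that matches every line.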

You can also make use of the -s option to assign a global variable.

    $ r='\Bpar\B'
    $ ruby -sne 'print if /#{$rgx}/' -- -rgx="$r" word_anchors.txt
    apparent effort
    two spare computers

info As an example, see my repo ch: command help for a practical shell script, where commands are constructed dynamically.

Executing external commands

You can call external commands using the system Kernel method. See ruby-doc: system for documentation.

    $ ruby -e 'system("echo Hello World")'
    Hello World

    $ ruby -e 'system("wc -w <word_anchors.txt")'
    12

    $ ruby -e 'system("seq -s, 10 > out.txt")'
    $ cat out.txt
    1,2,3,4,5,6,7,8,9,10

The return value of system or the global variable $? can be used to act upon the exit status of the command issued.

    $ ruby -e 'es=system("ls word_anchors.txt"); puts es'
    word_anchors.txt
    true
    $ ruby -e 'system("ls word_anchors.txt"); puts $?'
    word_anchors.txt
    pid 6087 exit 0
    $ ruby -e 'system("ls xyz.txt"); puts $?'
    ls: cannot access 'xyz.txt': No such file or directory
    pid 6164 exit 2
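In a script, $? is a Process::Status object, so the numeric exit code can be read with exitstatus instead of parsing the printed text. The sketch below assumes a POSIX environment where sh is available on PATH; run_status is a hypothetical helper name.

```ruby
# Run a command and return both the boolean result and the numeric exit code.
def run_status(*cmd)
  ok = system(*cmd)        # true for exit status 0, false for non-zero, nil if the command could not be run
  { ok: ok, code: $?.exitstatus }
end

p run_status("sh", "-c", "exit 0")
p run_status("sh", "-c", "exit 3")
```

Passing the command as separate arguments, as done here, avoids an extra layer of shell parsing for the command name and its arguments.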

To save the result of an external command, use backticks or %x.

    $ ruby -e 'words = `wc -w <word_anchors.txt`; puts words'
    12

    $ ruby -e 'nums = %x/seq 3/; print nums'
    1
    2
    3
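The captured output is an ordinary string, so it can be processed further with the usual String and Array methods. Here is a small sketch; the helper names are made up for this illustration, and it assumes a shell with the echo command available.

```ruby
# Count the whitespace-separated words printed by a command.
def word_count_of(command)
  output = `#{command}`            # backticks return the command's stdout as a String
  output.split.size
end

# Convert whitespace-separated numbers printed by a command into integers.
def numbers_from(command)
  %x(#{command}).split.map(&:to_i) # %x(...) behaves the same as backticks
end

puts word_count_of("echo one two three")
p numbers_from("echo 1 2 3")
```

Since the command string is handed to the shell, interpolating untrusted input into it is unsafe; the fixed strings above are used purely for demonstration.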

info See also stackoverflow: difference between exec, system and %x() or backticks

Summary

This chapter introduced some of the common options for ruby cli usage, along with typical cli text processing examples. While specific purpose cli tools like grep, sed and awk are usually faster, ruby has a much more extensive standard library and ecosystem. And you do not have to learn a lot if you are comfortable with ruby but not familiar with those cli tools. The next section has a few exercises for you to practice the cli options and text processing use cases.

Exercises

info Exercise related files are available from the exercises folder of the learn_ruby_oneliners repo.

info All the exercises are also collated together in one place at Exercises.md. For solutions, see Exercise_solutions.md.

a) For the input file ip.txt, display all lines containing is.

    $ cat ip.txt
    Hello World
    How are you
    This game is good
    Today is sunny
    12345
    You are funny

    ##### add your solution here
    This game is good
    Today is sunny

b) For the input file ip.txt, display first field of lines not containing y. Consider space as the field separator for this file.

    ##### add your solution here
    Hello
    This
    12345

c) For the input file ip.txt, display all lines containing no more than 2 fields.

    ##### add your solution here
    Hello World
    12345

d) For the input file ip.txt, display all lines containing is in the second field.

    ##### add your solution here
    Today is sunny

e) For each line of the input file ip.txt, replace first occurrence of o with 0.

    ##### add your solution here
    Hell0 World
    H0w are you
    This game is g0od
    T0day is sunny
    12345
    Y0u are funny

f) For the input file table.txt, calculate and display the product of numbers in the last field of each line. Consider space as the field separator for this file.

    $ cat table.txt
    brown bread mat hair 42
    blue cake mug shirt -7
    yellow banana window shoes 3.14

    ##### add your solution here
    -923.1600000000001

g) Append . to all the input lines for the given stdin data.

    $ printf 'last\nappend\nstop\n' | ##### add your solution here
    last.
    append.
    stop.

h) Use contents of s variable to display all matching lines from the input file ip.txt. Assume that s doesn't have any regexp metacharacters. Construct the solution such that there's at least one word character immediately preceding the contents of s variable.

    $ s='is'

    ##### add your solution here
    This game is good

i) Use system to display contents of filename present in second field (space separated) of the given input line.

    $ s='report.log ip.txt sorted.txt'
    $ echo "$s" | ##### add your solution here
    Hello World
    How are you
    This game is good
    Today is sunny
    12345
    You are funny

    $ s='power.txt table.txt'
    $ echo "$s" | ##### add your solution here
    brown bread mat hair 42
    blue cake mug shirt -7
    yellow banana window shoes 3.14

Source: https://learnbyexample.github.io/learn_ruby_oneliners/one-liner-introduction.html
