Download Files from FTP using JSch java library

SSH provides support for secure remote login(login to remote server similar to putty), secure file transfer(SCP or FTP/SFTP download), and secure TCP/IP and X11 forwardings. JSch is a Java implementation of SSH2 protocal.

In this example we will see how we can use JSch library to login to SFTP server and download files.

First, add the following dependency to your pom.xml
     <dependency>  
       <groupId>com.jcraft</groupId>  
       <artifactId>jsch</artifactId>  
       <version>0.1.54</version>  <!-- or latest version -->
     </dependency>  

JSch apis are pretty simple. First you create a session and open a channel then you can use one of the many function such as CD, LS, PUT, GET to change directory, list content, upload file or download respectively.

Create Session:


JSch jsch = new JSch();
Session session = jsch.getSession("demo", "test.rebex.net", 22);
session.setPassword("password");
session.connect();

Create Channel:

ChannelSftp channel = (ChannelSftp) session.openChannel("sftp");
channel.connect();

Change folder:

channelSftp.cd("/a/folder");

List content of a folder:

Vector<ChannelSftp.LsEntry> entries = channelSftp.ls(folder);

Download file:

channelSftp.get(String fileNameInFtp, String  destinationFile);

Upload File:

channelSftp. put(String src, String dst) //default is overwrite
channelSftp. put(String src, String dst, int mode)
Upload Modes:
public static final int
OVERWRITE=0;
public static final int RESUME=1;
public static final int APPEND=2;

A Complete Example Code to download files from FTP:

In this example, we are using a publicly available ftp server  as described in https://test.rebex.net/
 import com.jcraft.jsch.*;  
 import java.io.File;  
 import java.util.*;  
 public class JschDownload {  
public static void main(String[] args) { Session session = null; ChannelSftp channel = null; try { JSch jsch = new JSch(); session = jsch.getSession("demo", "test.rebex.net", 22); session.setPassword("password");
//to prevent following exception for sftp //com.jcraft.jsch.JSchException: UnknownHostKey: test.rebex.net. RSA key fingerprint is .. Properties config = new Properties(); config.put("StrictHostKeyChecking", "no"); session.setConfig(config); session.connect(); System.out.println("session connected");
//various channels are supported eg: shell, x11, channel = (ChannelSftp) session.openChannel("sftp"); channel.connect(); System.out.println("channel connected");
downloadFromFolder(channel, "/"); downloadFromFolder(channel, "/pub/example/");
     //in order to download all files including sub-folders/sub-sub-folder, we should iterate recursively System.out.println("File Uploaded to FTP Server Successfully.");
} catch (Exception e) { e.printStackTrace(); } finally { if (channel != null) { channel.disconnect(); } if (channel != null) { session.disconnect(); } } } static void downloadFromFolder(ChannelSftp channelSftp, String folder) throws SftpException { Vector<ChannelSftp.LsEntry> entries = channelSftp.ls(folder); new File("download").mkdir();
//download all files (except the ., .. and folders) from given folder for (ChannelSftp.LsEntry en : entries) { if (en.getFilename().equals(".") || en.getFilename().equals("..") || en.getAttrs().isDir()) { continue; }
System.out.println("Downloading " + (folder + en.getFilename()) + " ----> " + "download" + File.separator + en.getFilename()); channelSftp.get(folder + en.getFilename(), "download" + File.separator + en.getFilename()); } } }


Spock - call mocked method multiple times and return different results for same input

Spock - return different mock result for same input

In unit testing, we create and use mock objects for any complex/real object that's impractical or impossible to incorporate into a unit test. Generally any component that's outside of the scope of unit test are mocked. Mocking frameworks like JMock, EasyMock, Mockito provide an easy way to describe the expected behavior of a component without writing the full implementation of the object being mocked.

In Groovy world, the Spock testing framework includes powerful mocking capabilities without requiring additional mocking libraries.

In this article, I am going to describe how we can create mock objects that can be called multiple times and and each time they return multiple values.

First, let's see how we can return Fixed Values from a mocked method: for this, we use the right-shift (>>) operator to return a fixed value:
//the input parameter is _(any) and it will return "ok" everytime
mockObj.method(_) >> "ok"
To return different values for different invocation
//return 'ok' for param1 and 'not-ok' for param2
mockObj.method("param1") >> "ok"
mockObj.method("param2") >> "not-ok"

Finally, to return different values for same parameter: for this, we use triple-right shift (>>>) operator.
mockObj.method("param") >>> ["", "ok", "not-ok"]

It will return empty value for first invocation, "ok" for second and "not-ok" for third and rest of the invocation.

GraalVM setup and generate native image using maven plugin

GraalVM Setup and native image generation using Maven plugin

Today we are going to generate native image(one of many features of GraalVM) using GraalVM for the XML Parser that we developed earlier. Native image will contain the whole program in machine code ready for its immediate execution. It has the following advantages:
Ref: https://www.graalvm.org/docs/why-graal/#create-a-native-image
  • faster startup time
  • no need for JVM(JDK/JRE) to execute the application
  • low memory footprint
 Steps:

1) GraalVM setup

I used sdkman to install GraalVM SDK setup in my Linux machine. I used the following steps. First I listed all available JDK distributions and then I ran sdk install to install the latest GraalVM version. At the end of the installation I selected Yes to enable this version as default JDK.

sdk list java
sdk install java 20.1.0.r11-grl 

Then I verified the installation using following
java -version 

I got the following. So, everything working great so far:
openjdk version "11.0.7" 2020-04-14
OpenJDK Runtime Environment GraalVM CE 20.1.0 (build 11.0.7+10-jvmci-20.1-b02)
OpenJDK 64-Bit Server VM GraalVM CE 20.1.0 (build 11.0.7+10-jvmci-20.1-b02, mixed mode, sharing)


If you want to do it manually, download the zip file and extract it and add to system path and enable that as default JDK.

2) Native image tools installation

Before you can use GraalVM native image utility,  you need to have a working C developer environment. For this:

- On Linux, you will need GCC, and the glibc and zlib headers. 
Examples for common distributions:

    # dnf (rpm-based)
    sudo dnf install gcc glibc-devel zlib-devel libstdc++-static
    # Debian-based distributions:
    sudo apt-get install build-essential libz-dev zlib1g-dev

- On MacOS
    XCode provides the required dependencies on macOS:

    xcode-select --install

- On Windows, you will need to install the Visual Studio 2017 Visual C++ Build Tools


After this, you can run the following to install the native-image utility
$JAVA_HOME/bin/gu install native-image  

Here, $JAVA_HOME is your GraalVM installation directory

3) Finally, use GraalVM native image Maven plugin to generate native-image during package phase

For this, I added the following on my XML Parser's pom.xml file:  

Dependency:
<dependency>
    <groupId>org.graalvm.sdk</groupId>
    <artifactId>graal-sdk</artifactId>
    <version>${graalvm.version}</version>
    <scope>provided</scope>
</dependency>



Plugin: It automatically detects the jar file and the main class from the jar file. I've specified the imageName = xmltocsv as the executable

<plugin>
    <groupId>org.graalvm.nativeimage</groupId>
    <artifactId>native-image-maven-plugin</artifactId>
    <version>${graalvm.version}</version>
    <executions>
        <execution>
            <goals>
                <goal>native-image</goal>
            </goals>
        </execution>
    </executions>
    <configuration>
        <!--The plugin figures out what jar files it needs to pass to the native image
        and what the executable main class should be. -->
        <!--<mainClass>${app.mainClass}</mainClass>-->
        <imageName>xmltocsv</imageName>
        <buildArgs>
            --no-fallback
        </buildArgs>
        <skip>false</skip>
    </configuration>
</plugin>


The version:
<graalvm.version>20.1.0</graalvm.version>

And ran with following to generate the native image
mvnw clean package native-image:native-image




 It produced the following files on my target folder

── target
│   ├── xmltocsv   //this is the binary file, it can run without jvm
│   └── xmltocsv-FINAL.jar  //this required JVM to run


4) Testing

In my linux machine I executed the xmltocsv binary
$ ./target/xmltocsv ../big3.xml ../big3.csv

It started faster, used less memory but took little longer(because we lost JVM optimizations) to convert the file than running the jar file to do the same.

The complete example code is available here: https://github.com/gtiwari333/java-read-big-xml-to-csv

Create a huge data file for load testing

In this short blog, I am going to describe how you can create a big file for load testing. In most cases, you will only need step #2 to combine join big files.

For my earlier blog read-huge-xml-file-and-convert-to-csv, I needed to create a very big xml file (6GB +) without crashing my machine !!

My sample XML file would look like following with millions of <book> element.

<catalog>  
   <book id="001" lang="ENG">  
     ..  
   </book>  
   <book id="002" lang="ENG">  
     ..  
   </book>  
   ...  
 </catalog>  

Steps:
1) Since the start of the file contained <catalog> and file ended with </catalog>, I striped the start and end line and created a small file with just a few <book> elements.

//small.xml
<book id="001" lang="ENG">  
    ..  
 </book>  
 <book id="002" lang="ENG">  
    ..  
 </book>  
 <book id="003" lang="ENG">  
    ..  
 </book>  
 <book id="004" lang="ENG">  
    ..  
 </book>  


2) Used 'cat' to join files. The following would create bigger.xml by combining five small.xml files

cat small.xml  small.xml  small.xml  small.xml  small.xml >> bigger.xml


I can further do the following to gain exponential file size

cat bigger.xml  bigger.xml  bigger.xml  bigger.xml  bigger.xml >> evenbigger.xml

3) finally I used 'sed' to add <catalog> at beginning  and </catalog>  at end to create a proper xml file

 sed -i '1s/^/<catalog> /' bigger.xml
 sed -i -e '$a</catalog>' bigger.xml


4) Let's verify using tail and head

  head -10 bigger.xml
  tail -10 bigger.xml

I can see the <catalog>  at start and </catalog> at end. Hurray....

java read huge xml file and convert to csv

SAX parser uses event handler org.xml.sax.helpers.DefaultHandler to efficiently parse and handle the intermediate results of an XML file.  

It provides the following three important methods on each event where we can write custom logic to take specific action at each events:
  • startDocument() and endDocument() – Method called at the start and end of an XML document. 
  • startElement() and endElement() – Method called at the start and end of a document element.  
  • characters() – Method called with the text contents in between the start and end tags of an XML document element.
We will be using this class to read a HUGE xml file (6.58GB, it should support any size without any problem) efficiently and convert and write to CSV file.

I am going to use my existing code from my old blog xml-parsing-using-saxparser and updating it for this purpose. The final code is available on github project java-read-big-xml-to-csv


Java HUGE XML to CSV - project structure

How to Import/Run:

Its a simple maven project(with no dependencies). You can import it into your IDE or  use command line to compile and run.
If you plan on using Command Line, to compile and create a runnable jar file, go to the root of the project and run mvnw clean package .
Then you can run the executable as following:
java -jar target\xmltocsv-FINAL.jar  C:\folder\input.xml  C:\folder\output.csv

The code:

SaxParseEventHandler 
SaxParseEventHandler class takes the RecordWriter as constructor parameter
public SaxParseEventHandler(RecordWriter<Book> writer) {


We create new book record on startElement event
public void startElement(String s, String s1, String elementName, Attributes attributes) { /* handle start of a new Book tag and attributes of an element */ if (elementName.equalsIgnoreCase("book")) { //start bookTmp = new Book();


and we write the parsed book data to file on endElement() event.
public void endElement(String s, String s1, String element) { if (element.equals("book")) { //end writer.write(bookTmp, counter);





RecordWriter:
Its a simple wrapper for FileWriter to write content to file. We are currently writing T.toString() to file.
public void write(T t, int n) throws IOException { fw.write(t.toString()); if (n % 10000 == 0) { fw.flush(); } }

Main:
Its the main 'launcher' class
SAXParserFactory factory = SAXParserFactory.newInstance(); try (RecordWriter<Book> w = new RecordWriter<>(outputCSV)) { SAXParser parser = factory.newSAXParser(); parser.parse(inputXml, new SaxParseEventHandler(w)); }






Results at 16GB RAM, Core i5, 6MB L3 cache, SSD | Windows Machine
Max RAM usage: 190MB
Time Taken:
For the file big2.xml with size 118MB
- JDK8 - 8-9 sec
- JDK 11 - 6-7 sec
- JDK 14 - 5 sec 

big3.xml with size 6.58GB takes about 2 minutes


Next Steps: create a binary using GraalVM. I will keep posting !!