Showing posts with label java. Show all posts

Read all tables and columns in JPA/Hibernate

How to get metadata about all tables and columns managed by JPA/Hibernate?

There are several ways to list the tables and columns in a project that uses JPA/Hibernate. Each has pros and cons.

Option A) Direct query on INFORMATION_SCHEMA

The simplest way is to query INFORMATION_SCHEMA (or the equivalent catalog schema that the database uses internally) directly.

The following works for MySQL, MariaDB, H2, etc. Other databases need their own specific queries.

SELECT * from INFORMATION_SCHEMA.TABLES
SELECT * FROM INFORMATION_SCHEMA.COLUMNS

Option B) DB-independent query using the JDBC API

We can make a DB-independent query by asking the JDBC API for the metadata. Behind the scenes, the DB driver runs its own DB-specific queries to return it.

DataSource ds = ...; // create/wire the DataSource object
try (Connection con = ds.getConnection()) { // close the connection when done
    DatabaseMetaData metaData = con.getMetaData();
    ResultSet schemasRS = metaData.getSchemas();
    ResultSet tablesRS = metaData.getTables(null, null, null, new String[]{"TABLE"});
    // read the ResultSets here, while the connection is still open
}

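We can iterate over these ResultSets to read the schemas and tables, and call metaData.getColumns() to read the columns of each table. Note that this returns everything the database has, not just the tables the application uses.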
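As a rough sketch of that iteration, the helper below collects the table names first and then asks for the columns of each table. The class and method names (JdbcMetadataDump, tablesAndColumns) are made up for illustration; only the java.sql calls come from the JDBC API. It assumes table names are unique across schemas, which may not hold for your database.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original post
class JdbcMetadataDump {

    // Returns table name -> column names for every TABLE the metadata reports.
    static Map<String, List<String>> tablesAndColumns(DatabaseMetaData metaData) throws SQLException {
        Map<String, List<String>> result = new LinkedHashMap<>();

        // first pass: collect the table names
        try (ResultSet tables = metaData.getTables(null, null, "%", new String[]{"TABLE"})) {
            while (tables.next()) {
                result.put(tables.getString("TABLE_NAME"), new ArrayList<>());
            }
        }

        // second pass: collect the column names of each table
        for (Map.Entry<String, List<String>> entry : result.entrySet()) {
            try (ResultSet columns = metaData.getColumns(null, null, entry.getKey(), "%")) {
                while (columns.next()) {
                    entry.getValue().add(columns.getString("COLUMN_NAME"));
                }
            }
        }
        return result;
    }

    // Convenience overload for a live connection
    static Map<String, List<String>> tablesAndColumns(Connection connection) throws SQLException {
        return tablesAndColumns(connection.getMetaData());
    }
}
```

With a wired DataSource, this could be called as JdbcMetadataDump.tablesAndColumns(con) inside a try-with-resources block that opens the connection.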

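Option C) Use the EntityManager Metamodel to read entity classes

To retrieve only the entities/tables that the application uses, we can ask Hibernate's metamodel, as follows. Note that MetamodelImplementor and AbstractEntityPersister are Hibernate-internal types, so this may need adjusting between Hibernate versions.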

EntityManager em; //autowire the bean
MetamodelImplementor metaModelImpl = (MetamodelImplementor) em.getMetamodel();
List<String> tableNames = metaModelImpl
.entityPersisters()
.values().stream()
.map(ep -> ((AbstractEntityPersister) ep).getTableName())
.toList();

Option D) Use Hibernate Magic

Use Hibernate's Metadata class, which stores the ORM model determined by the provided entity mappings.

org.hibernate.boot.Metadata metadata; //getting the Metadata is tricky though
List<String> tableNames = new ArrayList<>();

for (PersistentClass persistentClass : metadata.getEntityBindings()) {
    tableNames.add(persistentClass.getTable().getExportIdentifier());
}



Download files from SFTP using the JSch Java library

SSH provides secure remote login (logging in to a remote server, as you would with PuTTY), secure file transfer (SCP or SFTP), and secure TCP/IP and X11 forwarding. JSch is a Java implementation of the SSH2 protocol.

In this example we will see how to use the JSch library to log in to an SFTP server and download files.

First, add the following dependency to your pom.xml
     <dependency>  
       <groupId>com.jcraft</groupId>  
       <artifactId>jsch</artifactId>  
       <version>0.1.54</version>  <!-- or latest version -->
     </dependency>  

The JSch API is pretty simple. First you create a session and open a channel; then you can use methods such as cd, ls, put and get to change directory, list content, upload a file or download a file respectively.

Create Session:


JSch jsch = new JSch();
Session session = jsch.getSession("demo", "test.rebex.net", 22);
session.setPassword("password");
session.connect();

Create Channel:

ChannelSftp channelSftp = (ChannelSftp) session.openChannel("sftp");
channelSftp.connect();

Change folder:

channelSftp.cd("/a/folder");

List content of a folder:

Vector<ChannelSftp.LsEntry> entries = channelSftp.ls(folder);

Download file:

channelSftp.get(String fileNameInFtp, String  destinationFile);

Upload File:

channelSftp.put(String src, String dst) //default is overwrite
channelSftp.put(String src, String dst, int mode)

Upload Modes:
public static final int OVERWRITE=0;
public static final int RESUME=1;
public static final int APPEND=2;

A complete example to download files from SFTP:

In this example, we are using a publicly available SFTP server, as described at https://test.rebex.net/
import com.jcraft.jsch.*;
import java.io.File;
import java.util.*;

public class JschDownload {

    public static void main(String[] args) {
        Session session = null;
        ChannelSftp channel = null;
        try {
            JSch jsch = new JSch();
            session = jsch.getSession("demo", "test.rebex.net", 22);
            session.setPassword("password");

            //to prevent following exception for sftp
            //com.jcraft.jsch.JSchException: UnknownHostKey: test.rebex.net. RSA key fingerprint is ..
            Properties config = new Properties();
            config.put("StrictHostKeyChecking", "no");
            session.setConfig(config);
            session.connect();
            System.out.println("session connected");

            //various channels are supported eg: shell, x11,
            channel = (ChannelSftp) session.openChannel("sftp");
            channel.connect();
            System.out.println("channel connected");

            downloadFromFolder(channel, "/");
            downloadFromFolder(channel, "/pub/example/");

            //in order to download all files including sub-folders/sub-sub-folder, we should iterate recursively
            System.out.println("Files downloaded from the SFTP server successfully.");
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (channel != null) {
                channel.disconnect();
            }
            if (session != null) { //was checking channel here, which could skip the session cleanup
                session.disconnect();
            }
        }
    }

    static void downloadFromFolder(ChannelSftp channelSftp, String folder) throws SftpException {
        Vector<ChannelSftp.LsEntry> entries = channelSftp.ls(folder);
        new File("download").mkdir();

        //download all files (except the ., .. and folders) from given folder
        for (ChannelSftp.LsEntry en : entries) {
            if (en.getFilename().equals(".") || en.getFilename().equals("..") || en.getAttrs().isDir()) {
                continue;
            }
            System.out.println("Downloading " + (folder + en.getFilename()) + " ----> " + "download" + File.separator + en.getFilename());
            channelSftp.get(folder + en.getFilename(), "download" + File.separator + en.getFilename());
        }
    }
}


GraalVM setup and generate native image using maven plugin

GraalVM Setup and native image generation using Maven plugin

Today we are going to use GraalVM to generate a native image (one of GraalVM's many features) for the XML Parser that we developed earlier. A native image contains the whole program as machine code, ready for immediate execution. It has the following advantages:
  • faster startup time
  • no need for a JVM (JDK/JRE) to execute the application
  • low memory footprint
Ref: https://www.graalvm.org/docs/why-graal/#create-a-native-image
 Steps:

1) GraalVM setup

I used sdkman to install the GraalVM SDK on my Linux machine, using the following steps. First I listed all available JDK distributions, and then I ran sdk install to install the latest GraalVM version. At the end of the installation I selected Yes to make this version the default JDK.

sdk list java
sdk install java 20.1.0.r11-grl 

Then I verified the installation using the following:
java -version 

I got the following output, so everything was working so far:
openjdk version "11.0.7" 2020-04-14
OpenJDK Runtime Environment GraalVM CE 20.1.0 (build 11.0.7+10-jvmci-20.1-b02)
OpenJDK 64-Bit Server VM GraalVM CE 20.1.0 (build 11.0.7+10-jvmci-20.1-b02, mixed mode, sharing)


If you want to do it manually, download the zip file, extract it, add it to the system path, and set it as the default JDK.

2) Native image tools installation

Before you can use GraalVM native image utility,  you need to have a working C developer environment. For this:

- On Linux, you will need GCC, and the glibc and zlib headers. 
Examples for common distributions:

    # dnf (rpm-based)
    sudo dnf install gcc glibc-devel zlib-devel libstdc++-static
    # Debian-based distributions:
    sudo apt-get install build-essential libz-dev zlib1g-dev

- On macOS, Xcode provides the required dependencies:

    xcode-select --install

- On Windows, you will need to install the Visual Studio 2017 Visual C++ Build Tools


After this, you can run the following to install the native-image utility
$JAVA_HOME/bin/gu install native-image  

Here, $JAVA_HOME is your GraalVM installation directory

3) Finally, use the GraalVM native image Maven plugin to generate the native image during the package phase

For this, I added the following to my XML Parser's pom.xml file:

Dependency:
<dependency>
    <groupId>org.graalvm.sdk</groupId>
    <artifactId>graal-sdk</artifactId>
    <version>${graalvm.version}</version>
    <scope>provided</scope>
</dependency>



Plugin: It automatically detects the jar file and the main class from the jar file. I've specified imageName = xmltocsv as the name of the executable.

<plugin>
    <groupId>org.graalvm.nativeimage</groupId>
    <artifactId>native-image-maven-plugin</artifactId>
    <version>${graalvm.version}</version>
    <executions>
        <execution>
            <goals>
                <goal>native-image</goal>
            </goals>
        </execution>
    </executions>
    <configuration>
        <!--The plugin figures out what jar files it needs to pass to the native image
        and what the executable main class should be. -->
        <!--<mainClass>${app.mainClass}</mainClass>-->
        <imageName>xmltocsv</imageName>
        <buildArgs>
            --no-fallback
        </buildArgs>
        <skip>false</skip>
    </configuration>
</plugin>


The version:
<graalvm.version>20.1.0</graalvm.version>

Then I ran the following to generate the native image:
mvnw clean package native-image:native-image




 It produced the following files in my target folder:

── target
│   ├── xmltocsv   //this is the binary file, it can run without a JVM
│   └── xmltocsv-FINAL.jar  //this requires a JVM to run


4) Testing

On my Linux machine I executed the xmltocsv binary:
$ ./target/xmltocsv ../big3.xml ../big3.csv

Compared to running the jar file for the same conversion, it started faster and used less memory, but took a little longer to convert the file (because we lose the JVM's runtime optimizations).

The complete example code is available here: https://github.com/gtiwari333/java-read-big-xml-to-csv

Java: read a huge XML file and convert it to CSV

The SAX parser uses the event handler org.xml.sax.helpers.DefaultHandler to efficiently parse an XML file and handle the intermediate results.

It provides the following important callback methods, where we can write custom logic to take a specific action on each event:
  • startDocument() and endDocument() – Methods called at the start and end of an XML document.
  • startElement() and endElement() – Methods called at the start and end of a document element.
  • characters() – Method called with the text content between the start and end tags of an XML document element.
We will use this class to read a HUGE XML file (6.58GB; since the file is streamed, it should handle any size without any problem) efficiently and convert and write it to a CSV file.
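To make these callbacks concrete, here is a minimal, self-contained sketch that collects the text of every <title> element from a small in-memory document. The class names (SaxEventDemo, TitleHandler) and the sample XML are invented for illustration and are not taken from the actual project.

```java
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original project
class SaxEventDemo {

    // Collects the text of every <title> element while the parser streams through the document
    static class TitleHandler extends DefaultHandler {
        final List<String> titles = new ArrayList<>();
        private StringBuilder text; // non-null only while we are inside a <title>

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            if (qName.equals("title")) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // may be called several times for one element, so append instead of assigning
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("title")) {
                titles.add(text.toString().trim());
                text = null;
            }
        }
    }

    static List<String> titles(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        TitleHandler handler = new TitleHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<catalog><book id='1'><title>Java</title></book>"
                + "<book id='2'><title>XML</title></book></catalog>";
        System.out.println(titles(xml)); // [Java, XML]
    }
}
```

Because the handler only keeps the current element's text in memory, the same pattern works for files far larger than the available heap.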

I am going to take my existing code from my old blog post xml-parsing-using-saxparser and update it for this purpose. The final code is available in the GitHub project java-read-big-xml-to-csv.


Java HUGE XML to CSV - project structure

How to Import/Run:

It's a simple Maven project (with no dependencies). You can import it into your IDE or use the command line to compile and run it.
If you plan on using the command line, go to the root of the project and run mvnw clean package to compile and create a runnable jar file.
Then you can run the executable as follows:
java -jar target\xmltocsv-FINAL.jar  C:\folder\input.xml  C:\folder\output.csv

The code:

SaxParseEventHandler
The SaxParseEventHandler class takes the RecordWriter as a constructor parameter:
public SaxParseEventHandler(RecordWriter<Book> writer) {


We create a new Book record on the startElement() event:
public void startElement(String s, String s1, String elementName, Attributes attributes) {
    /* handle start of a new Book tag and attributes of an element */
    if (elementName.equalsIgnoreCase("book")) { //start
        bookTmp = new Book();


and we write the parsed book data to the file on the endElement() event:
public void endElement(String s, String s1, String element) {
    if (element.equals("book")) { //end
        writer.write(bookTmp, counter);





RecordWriter:
It's a simple wrapper for FileWriter to write content to a file. We are currently writing T.toString() to the file.
public void write(T t, int n) throws IOException {
    fw.write(t.toString());
    if (n % 10000 == 0) {
        fw.flush();
    }
}
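For readers who want to see the whole shape of such a wrapper, here is a hedged sketch built around the write() method above. The class name SimpleRecordWriter and the String path constructor are assumptions made for this example; the real RecordWriter in the repository may look different.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: a generic wrapper that writes each record's toString() to a file
class SimpleRecordWriter<T> implements AutoCloseable {
    private final FileWriter fw;

    SimpleRecordWriter(String outputFile) throws IOException {
        this.fw = new FileWriter(outputFile);
    }

    // n is the running record count; flushing every 10000 records bounds the buffered data
    void write(T t, int n) throws IOException {
        fw.write(t.toString());
        if (n % 10000 == 0) {
            fw.flush();
        }
    }

    @Override
    public void close() throws IOException {
        fw.close(); // also flushes whatever is still buffered
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("records", ".csv");
        // try-with-resources closes the writer even if a write fails
        try (SimpleRecordWriter<String> writer = new SimpleRecordWriter<>(tmp.toString())) {
            writer.write("1,Java\n", 1);
            writer.write("2,XML\n", 2);
        }
        System.out.println(Files.readAllLines(tmp)); // [1,Java, 2,XML]
    }
}
```

Implementing AutoCloseable is what allows the writer to be used in a try-with-resources block, as the launcher class does.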

Main:
It's the main 'launcher' class:
SAXParserFactory factory = SAXParserFactory.newInstance();
try (RecordWriter<Book> w = new RecordWriter<>(outputCSV)) {
    SAXParser parser = factory.newSAXParser();
    parser.parse(inputXml, new SaxParseEventHandler(w));
}






Results at 16GB RAM, Core i5, 6MB L3 cache, SSD | Windows Machine
Max RAM usage: 190MB
Time Taken:
For the file big2.xml with size 118MB
- JDK8 - 8-9 sec
- JDK 11 - 6-7 sec
- JDK 14 - 5 sec 

big3.xml with size 6.58GB takes about 2 minutes


Next Steps: create a binary using GraalVM. I will keep posting !!

Java Compress/Decompress String/Data

Java provides the Deflater class for general purpose compression using the ZLIB compression library. It also provides the DeflaterOutputStream which uses the Deflater class to filter a stream of data by compressing (deflating) it and then writing the compressed data to another output stream. There are equivalent Inflater and InflaterOutputStream classes to handle the decompression.
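Before moving to the stream classes, here is a minimal sketch of using Deflater and Inflater directly, for cases where you already hold the whole input in memory. The class and method names (RawDeflaterDemo, deflate, inflate) and the 1024-byte buffer size are arbitrary choices for this example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class, not part of the original post
class RawDeflaterDemo {

    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish(); // signal that no more input will follow
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end(); // release native resources
        return out.toByteArray();
    }

    static byte[] inflate(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                out.write(buffer, 0, n);
                if (n == 0 && !inflater.finished()) {
                    // no progress and not done: input is truncated or needs a preset dictionary
                    throw new DataFormatException("incomplete compressed data");
                }
            }
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws DataFormatException {
        byte[] input = "hello hello hello hello".getBytes(StandardCharsets.UTF_8);
        byte[] packed = deflate(input);
        String restored = new String(inflate(packed), StandardCharsets.UTF_8);
        System.out.println(restored); // hello hello hello hello
    }
}
```

The stream classes used in the rest of this post wrap the same Deflater and Inflater loop, which is why they are usually the more convenient choice.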

Compression


Here is an example of how to use the DeflaterOutputStream to compress a byte array.
static byte[] compressBArray(byte[] bArray) throws IOException {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(os)) {
            dos.write(bArray);
        }
        return os.toByteArray();
}

Let's test:

byte[] input = "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"        .getBytes();
byte[] op = CompressionUtil.compressBArray(input);
System.out.println("original data length " + input.length +
        ",  compressed data length " + op.length);

This prints 'original data length 71,  compressed data length 12'

Decompression

Here is the matching method, which uses InflaterOutputStream to decompress a byte array:

public static byte[] decompress(byte[] compressedTxt) throws IOException {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        try (OutputStream ios = new InflaterOutputStream(os)) {
            ios.write(compressedTxt);
        }
        return os.toByteArray();
}
Passing it the compressed bytes from the previous example returns the bytes of the original 'input' string.


Let's convert the byte[] to Base64 to make it portable

In the above examples we get the compressed data as a byte array (byte[]), which is raw binary data.

But we might want to store or transmit the compressed data as text, for example in a file, JSON, or a DB column. For that, we can convert it to Base64 as follows:

byte[] bytes = {}; //the compressed byte array
String b64Compressed = Base64.getEncoder().encodeToString(bytes);
//later: decode the Base64 text back to the compressed bytes, then decompress them
byte[] compressedBArray = Base64.getDecoder().decode(b64Compressed);
byte[] decompressedBArray = decompress(compressedBArray);
//convert to original string if input was string
new String(decompressedBArray, StandardCharsets.UTF_8);

Here's the complete code and the test cases

package compress;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterOutputStream;

public class CompressionUtil {

    public static String compressAndReturnB64(String text) throws IOException {
        return new String(Base64.getEncoder().encode(compress(text)));
    }

    public static String decompressB64(String b64Compressed) throws IOException {
        byte[] decompressedBArray = decompress(Base64.getDecoder().decode(b64Compressed));
        return new String(decompressedBArray, StandardCharsets.UTF_8);
    }

    public static byte[] compress(String text) throws IOException {
        // use the same charset that decompressB64 uses when rebuilding the string
        return compress(text.getBytes(StandardCharsets.UTF_8));
    }

    public static byte[] compress(byte[] bArray) throws IOException {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(os)) {
            dos.write(bArray);
        }
        return os.toByteArray();
    }

    public static byte[] decompress(byte[] compressedTxt) throws IOException {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        try (OutputStream ios = new InflaterOutputStream(os)) {
            ios.write(compressedTxt);
        }
        return os.toByteArray();
    }

}

Test case:

package compress;

import org.junit.jupiter.api.Test;

import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class CompressionTest {

    String testStr = "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";

    @Test
    void compressByte() throws IOException {
        byte[] input = testStr.getBytes();
        byte[] op = CompressionUtil.compress(input);
        System.out.println("original data length " + input.length + ",  compressed data length " + op.length);
        byte[] org = CompressionUtil.decompress(op);
        System.out.println(org.length);
        System.out.println(new String(org, StandardCharsets.UTF_8));
    }

    @Test
    void compress() throws IOException {

        String op = CompressionUtil.compressAndReturnB64(testStr);
        System.out.println("Compressed data b64: " + op);
        String org = CompressionUtil.decompressB64(op);
        System.out.println("Original text: " + org);
    }

}


 Note: Since the compress and decompress methods operate on byte[], we can compress/decompress any data type that can be converted to bytes.
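As an illustration of that note, the sketch below compresses an int[] by first converting it to bytes with ByteBuffer. The class and helper names (CompressIntsDemo, compressInts, decompressInts) are made up for this example, and the compress/decompress methods are copied in only to keep the block self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterOutputStream;

// Hypothetical demo class, not part of the original post
class CompressIntsDemo {

    static byte[] compress(byte[] data) throws IOException {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(os)) {
            dos.write(data);
        }
        return os.toByteArray();
    }

    static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        try (OutputStream ios = new InflaterOutputStream(os)) {
            ios.write(compressed);
        }
        return os.toByteArray();
    }

    // int[] -> bytes (4 bytes per value, big-endian) -> compressed bytes
    static byte[] compressInts(int[] values) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(values.length * Integer.BYTES);
        buffer.asIntBuffer().put(values);
        return compress(buffer.array());
    }

    // compressed bytes -> bytes -> int[]
    static int[] decompressInts(byte[] compressed) throws IOException {
        ByteBuffer buffer = ByteBuffer.wrap(decompress(compressed));
        int[] values = new int[buffer.remaining() / Integer.BYTES];
        buffer.asIntBuffer().get(values);
        return values;
    }

    public static void main(String[] args) throws IOException {
        int[] readings = new int[1000]; // all zeros, so highly compressible
        byte[] packed = compressInts(readings);
        System.out.println(packed.length < readings.length * Integer.BYTES); // true
        System.out.println(Arrays.equals(readings, decompressInts(packed))); // true
    }
}
```

The same idea applies to any other type: convert it to bytes first (for example via serialization), then compress those bytes.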

AWS DynamoDB - dynamic table prefix using DynamoDBMapper


We can use DynamoDBMapperConfig.TableNameOverride to configure the DynamoDBMapper with a custom/dynamic table name prefix, via TableNameOverride.withTableNamePrefix(String).


Plain Java Example:

import com.amazonaws.services.dynamodbv2.*;
import com.amazonaws.services.dynamodbv2.datamodeling.*;

import java.util.UUID;

//code:

String prefix = "SOME_DYNAMIC_PREFIX"; //can be pulled from a dynamic logic eg: profile, env variable etc
var mapperConfig = new DynamoDBMapperConfig.Builder()
.withTableNameOverride(DynamoDBMapperConfig.TableNameOverride.withTableNamePrefix(prefix + "-"))
.build();

var dynamoDB = AmazonDynamoDBClientBuilder.standard().build();
var dbMapper = new DynamoDBMapper(dynamoDB, mapperConfig);


// use it
dbMapper.load(MyTable.class, UUID.randomUUID());

Spring DynamoDB dynamic table prefix example



import com.amazonaws.services.dynamodbv2.*;
import com.amazonaws.services.dynamodbv2.datamodeling.*;
import org.springframework.context.annotation.*;
import java.util.UUID;

@Configuration
class AwsConfig {
@Bean
AmazonDynamoDB dynamoDB() {
return AmazonDynamoDBClientBuilder.standard().build();
}

@Bean
DynamoDBMapperConfig dynamoDBMapperConfig() {
String prefix = "SOME_DYNAMIC_PREFIX"; //can be pulled from a dynamic logic eg: profile, env variable etc
return new DynamoDBMapperConfig.Builder()
.withTableNameOverride(DynamoDBMapperConfig.TableNameOverride.withTableNamePrefix(prefix + "-"))
.build();
}

@Bean
DynamoDBMapper dynamoDBMapper(AmazonDynamoDB dynamoDB, DynamoDBMapperConfig dynamoDBMapperConfig)
{
return new DynamoDBMapper(dynamoDB, dynamoDBMapperConfig);
}
}


import com.amazonaws.services.dynamodbv2.datamodeling.*;
import java.util.UUID;
@DynamoDBTable(tableName = "person")
public class MyTable {
@DynamoDBHashKey
@DynamoDBAutoGeneratedKey
UUID id;

String name;
//getter setter/other fields
}

Web Scraping in Java using JSoup

Example of Web Scraping in Java using JSoup

In this blog I'm going to describe how we can use the JSoup library to scrape content from a website. Websites use a standard markup language called HTML to display documents in a web browser. An HTML page has an XML-like document structure composed of elements and attributes.

<rootElement> //element with tag rootElement

   <aTag width="10" height="20" color="RED"> //sub element aTag with attributes width, height etc

        <content>Hello</content>  //another nested sub element

    </aTag>

    <summary> This is summary.</summary> //another element under root element

</rootElement>

Although an HTML document starts with <HTML> and keeps its content under the <BODY> element, the specific meaning of each HTML tag matters little for web scraping. What matters is that HTML, like XML, is a tree of elements and attributes. Web scraping libraries parse that tree and let us read the data out of it.

Let's build a quotes scraping app!

In this example we are going to extract quotes from goodreads.com (https://www.goodreads.com/quotes).

Step 1: Setup a skeleton Java Project with JSoup dependency

We are going to use Maven to add the JSoup dependency and build the project.

Step 1.a Generate Maven Project using maven archetype

mvn archetype:generate -DgroupId=gt  -DartifactId=web-scrapper-java    -DarchetypeArtifactId=maven-archetype-quickstart   -DinteractiveMode=false 

It generated the following files. Note that I deleted the AppTest.java under /src/test/java/gt/ because we won't be writing unit tests for this app.

├── pom.xml
├── src
│   └── main
│       └── java
│           └── gt
│               ├── App.java

Step 1.b Add JSoup dependency

I searched for the jsoup dependency at https://mvnrepository.com/artifact/org.jsoup/jsoup, copied the following definition for the current version of jsoup, and pasted it inside the <dependencies> section of pom.xml:


<dependency>
<groupId>org.jsoup</groupId>
<artifactId>jsoup</artifactId>
<version>1.13.1</version> <!-- use the new version -->
</dependency>

I also deleted junit dependency from pom.xml since we won't be writing unit tests.

Step 2: Basic Scraping Examples

Let's play with the JSoup API first. See the examples below. Here we are parsing XML-like content from a string and extracting several pieces of it using CSS queries. Please refer to https://www.w3schools.com/cssref/css_selectors.asp for more examples of CSS queries.


import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import static java.lang.System.out;

public class Test {


public static void main(String[] args) {

String html = "<rootElement> " +
" <aTag width='10' height='20' color='RED' class='C1'> " +
" <content>Hello</content> " +
" </aTag>" +
" <aTag width='10' height='20' color='GREEN' class='C1'> " +
" <content class = 'small-font'>Hello Again small font</content> " +
" </aTag>" +
" <summary>" +
" <content class = 'small-font'> This is summary in small font </content>" +
" </summary> " +
"</rootElement>";

Document doc = Jsoup.parse(html);

//print all content element
/*
it prints:
Hello
Hello Again small font
This is summary in small font
*/
Elements els = doc.select("content");
for (Element e : els) {
out.println(e.text());
}

//text inside content element under aTag
/*
it prints:
Hello
Hello Again small font
*/
for (Element e : doc.select("aTag > content")) {
out.println(e.text());
}

//get all elements that have a color attribute and display the value of the attribute
/*
it prints:
RED
GREEN
*/
for (Element e : doc.getElementsByAttribute("color")) {
out.println(e.attributes().get("color"));
}

//get all elements that have the attribute class = C1 and display the value of their color attribute
/*
it prints:
RED
GREEN
*/
for (Element e : doc.select(".C1")) {
out.println(e.attributes().get("color"));
}

//read text inside a tag
/*
it prints:
Hello Again small font
This is summary in small font
*/
for (Element e : doc.select(".small-font")) {
out.println(e.text());
}

}
}


Step 3: Scraping goodreads.com

Step 3.a Examine the html content

The first step is to examine the structure of the document to see where our data is located. Here we want to read the quote, author and the tags.

After inspecting the structure of the HTML with the browser's inspect tool, we can notice that:

  • The <div class='quote'> is repeated for each quote.
  • The quote text is inside the 'quoteText' class.
  • The author name is inside the 'authorOrTitle' class under the 'quoteText' class.
  • Tags are inside the 'quoteFooter' class.

Here's the html content we are interested in. We want to extract the quote text, the author name and the tags.
<div class="quoteText">
      “I'm selfish, impatient and a little insecure. I make mistakes, I am out of control and at times hard to handle. But if you can't handle me at my worst, then you sure as hell don't deserve me at my best.
  <br>  ―
  <span class="authorOrTitle">
    Marilyn Monroe
  </span>
</div>
<div class="quoteFooter">
   <div class="greyText smallText left">
     tags:
       <a href="/quotes/tag/attributed-no-source">attributed-no-source</a>,
       <a href="/quotes/tag/best">best</a>,
       <a href="/quotes/tag/life">life</a>,
       <a href="/quotes/tag/love">love</a>,
       <a href="/quotes/tag/mistakes">mistakes</a>,
       <a href="/quotes/tag/out-of-control">out-of-control</a>,
       <a href="/quotes/tag/truth">truth</a>,
       <a href="/quotes/tag/worst">worst</a>
   </div>
   <div class="right">
     <a class="smallText" title="View this quote" href="/quotes/8630-i-m-selfish-impatient-and-a-little-insecure-i-make-mistakes">151963 likes</a>
   </div>
</div>
 

Step 3.b Read quotes from goodreads.com

In the above example we used a static String to parse. We can use Jsoup.connect(THE URL).get() to read a webpage and get the Document object as below:

Document doc = Jsoup.connect("https://www.goodreads.com/quotes?page=1").get();

The full code to read quote text, author and tags

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.io.IOException;
import java.util.List;
import java.util.stream.Collectors;

public class GoodReadsScrapper {

public static void main(String[] args) throws IOException {
Document doc = Jsoup.connect("https://www.goodreads.com/quotes?page=1").get();

Elements quoteElements = doc.select(".quoteText");

for (Element e : quoteElements) {

//read quote text and the author from the body of quoteText css
//e.text() returns all the visible text inside this element which also includes the author... use ownText to not look at child elements
String qStr = e.ownText();
String quoteText = qStr.replaceAll("“", "").replaceAll("”", "");

//author is inside span inside authorOrTitle class within the current element
String author = e.select(".authorOrTitle").text();

//Tags: read sibling element of div with class 'quoteText', choose the one with class 'quoteFooter' and read the a tags
Elements tagElements = e.nextElementSiblings().select(".quoteFooter").select(".greyText").select("a");
List<String> tags = tagElements.stream().map(Element::text).collect(Collectors.toList());

System.out.println(quoteText + " By:" + author + " , Tags:" + tags);
}
}

}

 

Step 4: Thinking Bigger:

What if we want to read quotes from multiple web sites?

What if we want to store the quotes to DB?

What if we want to run the scraping job periodically?

For these 'what-ifs', I updated the above code to include following:

├── pom.xml
├── src
│   └── main
│       └── java
│           └── gt
│               ├── GoodReadsScrapper.java //implementation for GoodReads
│               ├── Quote.java //wrapper class to hold quote data
│               ├── QuoteScrapper.java  //base interface
│               ├── ScrapperService.java //a job
│               ├── Source.java //enum to hold sources

The source is available at https://github.com/gtiwari333/java-web-scrapping-jsoup

 

A bigger web application that uses Spring Boot and Angular is available here: https://github.com/gtiwari333/spring-boot-keycloak-angular-quote-app