In my current project, I had a requirement to read a WSDL file from a URL and store it in the database as a CLOB.
No validation was required, so the task came down to reading the URL content into a String and then storing it in a database table.
Java Read URL to String
Here is the Java program I wrote to read URL content into a String.
package com.journaldev.java;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class ReadURLToString {

    public static void main(String[] args) throws Exception {
        URL test = new URL("https://journaldev.com");
        URLConnection uc = test.openConnection();
        // Some servers reject requests that lack a browser-like User-Agent
        uc.addRequestProperty("User-Agent", "Mozilla/4.0");

        StringBuilder sb = new StringBuilder();
        // try-with-resources closes the reader even if reading fails.
        // UTF-8 is assumed here; use the charset the server declares if it differs.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(uc.getInputStream(), StandardCharsets.UTF_8))) {
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                // readLine() strips the line terminator, so add it back
                sb.append(inputLine).append('\n');
                System.out.println(inputLine);
            }
        }
        System.out.println("HTML Data:" + sb.toString());
    }
}
When we run the above program, it prints the page content line by line and then prints the complete content once more after the "HTML Data:" label.
Most of the code is self-explanatory, except for setting the HTTP User-Agent header.
For some websites, if you don’t set the User-Agent header, you might get an HTTP 403 (Forbidden) response. That’s because their web servers have security in place to block bot traffic.
If you remove the setting of User-Agent from the above program, it will produce the following error.
Exception in thread "main" java.io.IOException: Server returned HTTP response code: 403 for URL: https://www.journaldev.com/
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1876)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474)
    at ReadURLToString.main(ReadURLToString.java:12)
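If you would rather detect a rejected request yourself instead of letting getInputStream() throw, one option is to cast the connection to HttpURLConnection and check the status code first. The following is only a minimal sketch under a few assumptions: the class name UrlFetcher, the method name fetch, the 10-second timeouts, and the UTF-8 charset are illustrative choices of mine and not part of the program above, and readAllBytes() needs Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the class and method names are illustrative only.
class UrlFetcher {

    // Returns the response body as a String, or throws an IOException
    // that names the HTTP status when the server does not answer 200.
    static String fetch(String address, String userAgent) throws IOException {
        // The cast works for http and https URLs (HttpsURLConnection extends HttpURLConnection)
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        conn.setRequestProperty("User-Agent", userAgent);
        conn.setConnectTimeout(10_000); // assumed timeouts, tune as needed
        conn.setReadTimeout(10_000);
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status + " for " + address);
            }
            try (InputStream in = conn.getInputStream()) {
                // readAllBytes() keeps line breaks intact; UTF-8 is an assumption
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling UrlFetcher.fetch("https://journaldev.com", "Mozilla/4.0") should return the page as a String that can be stored as-is. If the server rejects the User-Agent, the call should fail with an IOException whose message contains the status code, which is easier to log than the raw stack trace.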
If you landed here looking for something similar, feel free to use the above code. Don’t forget to comment or share with others too. That’s all for reading URL content in a Java program.
Reference: Java URLConnection API Doc