Thursday, June 30, 2022

Read URL Contents using HttpURLConnection

Working with HttpURLConnection class

The HttpURLConnection class is used to establish a connection to HTTP endpoints. With it, you can open a connection to a given HTTP endpoint, read and write content, read and write headers, read response status codes, and so on. This tutorial series gives you hands-on practice with all of these.

·       Read URL Contents using HttpURLConnection
·       Get headers from url using HttpURLConnection
·       Adding headers using HttpURLConnection
·       Post data to url using HttpURLConnection

Follow the steps below to read the contents of a URL.


Step 1: Create a connection to the url.
                 URL url = new URL(urlToConnect);
                 HttpURLConnection httpUrlConnection = (HttpURLConnection) url.openConnection();
                
Step 2: Create input stream from the url connection.

                 int responseCode = httpUrlConnection.getResponseCode();
                 InputStream inputStream = null;

                 if (responseCode >= 200 && responseCode < 400) {
                          inputStream = httpUrlConnection.getInputStream();
                 } else {
                          inputStream = httpUrlConnection.getErrorStream();
                 }

Step 3: Read the content from the input stream and print it to the console.

                 BufferedReader br = new BufferedReader(new InputStreamReader(inputStream));

                 String line = null;

                 while ((line = br.readLine()) != null) {
                          System.out.println(line);
                 }


Find the working application below.


Test.java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class Test {

 private static HttpURLConnection getURLConnection(String urlToConnect) throws IOException {
  URL url = new URL(urlToConnect);
  HttpURLConnection httpUrlConnection = (HttpURLConnection) url.openConnection();
  return httpUrlConnection;
 }

 private static InputStream getContent(HttpURLConnection httpUrlConnection) throws IOException {
  int responseCode = httpUrlConnection.getResponseCode();
  InputStream inputStream = null;

  if (responseCode >= 200 && responseCode < 400) {
   inputStream = httpUrlConnection.getInputStream();
  } else {
   inputStream = httpUrlConnection.getErrorStream();
  }

  return inputStream;
 }

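 private static void printInputStream(InputStream inputStream) throws IOException {
  // getErrorStream() returns null when the server sent no error body
  if (inputStream == null) {
   return;
  }

  // try-with-resources closes the reader and the underlying stream
  try (BufferedReader br = new BufferedReader(new InputStreamReader(inputStream))) {
   String line;

   while ((line = br.readLine()) != null) {
    System.out.println(line);
   }
  }
 }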

 public static void main(String[] args) throws IOException {
  String url = "https://self-learning-java-tutorial.blogspot.com/2016/05/java-home-page.html";
  HttpURLConnection httpUrlConnection = getURLConnection(url);
  InputStream inputStream = getContent(httpUrlConnection);
  printInputStream(inputStream);

 }

}
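
The reader in Test.java decodes bytes with the platform default charset. A minimal sketch of a variant that names the charset explicitly and returns the text instead of printing it, assuming the response body is UTF-8. The class name StreamText and the method readAll are hypothetical names chosen for illustration, and an in-memory stream stands in for the connection's input stream:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StreamText {

  // Reads the whole stream as UTF-8 text (an assumption about the server's
  // encoding). Line terminators are normalised to '\n'.
  public static String readAll(InputStream inputStream) {
    if (inputStream == null) {
      return ""; // getErrorStream() may return null when there is no error body
    }
    StringBuilder content = new StringBuilder();
    // try-with-resources closes the reader and the wrapped stream
    try (BufferedReader br = new BufferedReader(
        new InputStreamReader(inputStream, StandardCharsets.UTF_8))) {
      String line;
      while ((line = br.readLine()) != null) {
        content.append(line).append('\n');
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return content.toString();
  }

  public static void main(String[] args) {
    // An in-memory stream stands in for httpUrlConnection.getInputStream()
    InputStream fake = new ByteArrayInputStream(
        "first line\nsecond line".getBytes(StandardCharsets.UTF_8));
    System.out.print(readAll(fake));
  }
}
```

In the Test class, the result of getContent(httpUrlConnection) could be passed to readAll in place of printInputStream.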

HttpURLConnection: work with HTTP urls

HttpURLConnection is a subclass of URLConnection and provides extra functionality for working with HTTP URLs.


How to get an instance of HttpURLConnection
Since HttpURLConnection is an abstract class, you can’t instantiate it directly. Instead, obtain an instance by calling the openConnection() method of the URL class and casting the result to HttpURLConnection.
/**
   * 
   * @param resource Resource to connect to
   * @return HttpURLConnection object for this resource
   */
  public static HttpURLConnection getConnection(String resource) {

    URL url = null;
    HttpURLConnection connection = null;
    try {
      url = new URL(resource);
      connection = (HttpURLConnection) url.openConnection();
    } catch (MalformedURLException e) {
      e.printStackTrace();
    } catch (IOException e) {
      e.printStackTrace();
    }

    return connection;
  }

By default, HttpURLConnection uses the GET method. You can change the request method by calling the setRequestMethod() method.

public void setRequestMethod(String method) throws ProtocolException
Sets the method for the URL request. The method must be one of GET, POST, HEAD, OPTIONS, PUT, DELETE, TRACE.
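
A minimal sketch of how setRequestMethod() treats supported and unsupported method names. The class name RequestMethodDemo, the helper isAccepted, and the host example.com are assumptions made for illustration. The sketch relies on openConnection() only creating the connection object, so no request is sent:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.ProtocolException;
import java.net.URL;

public class RequestMethodDemo {

  // Returns true if HttpURLConnection accepts the given request method.
  // The connection is created but never connected.
  public static boolean isAccepted(String method) {
    try {
      // example.com is a placeholder host, chosen only for illustration
      URL url = new URL("http://example.com/");
      HttpURLConnection connection = (HttpURLConnection) url.openConnection();
      connection.setRequestMethod(method);
      return method.equals(connection.getRequestMethod());
    } catch (ProtocolException e) {
      return false; // the method is not one of the supported HTTP methods
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("HEAD accepted: " + isAccepted("HEAD"));
    System.out.println("FETCH accepted: " + isAccepted("FETCH"));
  }
}
```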


The following application gets the headers of a given resource.

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.ProtocolException;
import java.net.URL;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HttpURLConnectionUtil {

  /**
   * 
   * @param resource
   *            Resource to connect to
   * @return HttpURLConnection object for this resource
   */
  public static HttpURLConnection getConnection(String resource) {

    URL url = null;
    HttpURLConnection connection = null;
    try {
      url = new URL(resource);
      connection = (HttpURLConnection) url.openConnection();
    } catch (MalformedURLException e) {
      e.printStackTrace();
    } catch (IOException e) {
      e.printStackTrace();
    }

    return connection;
  }

  /**
   * @param connection
   * @return map of header fields
   */
  public static Map<String, List<String>> getHeaders(
      HttpURLConnection connection) {

    try {
      connection.setRequestMethod("HEAD");
      Map<String, List<String>> headersMap = connection.getHeaderFields();
      return headersMap;

    } catch (ProtocolException e) {
      e.printStackTrace();
    }

    return new HashMap<String, List<String>>();
  }

}


import java.net.HttpURLConnection;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class Main {

  public static void main(String args[]) {
    String resource = "https://self-learning-java-tutorial.blogspot.com";

    HttpURLConnection connection = HttpURLConnectionUtil
        .getConnection(resource);
    Map<String, List<String>> headersMap = HttpURLConnectionUtil
        .getHeaders(connection);

    Set<String> headersSet = headersMap.keySet();
    for (String header : headersSet) {
      System.out.println(header + " " + headersMap.get(header));
    }
    
    connection.disconnect();

  }
}


Sample Output
null [HTTP/1.1 200 OK]
ETag ["48d68749-2249-4b04-8401-c716a53d1936"]
Date [Fri, 22 May 2015 06:25:19 GMT]
Content-Length [0]
X-XSS-Protection [1; mode=block]
Expires [Fri, 22 May 2015 06:25:19 GMT]
Alternate-Protocol [80:quic,p=0]
Last-Modified [Fri, 22 May 2015 03:51:54 GMT]
Content-Type [text/html; charset=UTF-8]
Server [GSE]
X-Content-Type-Options [nosniff]
Cache-Control [private, max-age=0]

URLConnection : Writing Data to server

By using URLConnection, you can write data to a server. Once the connection object is created, get its output stream and write the data to the server.


public OutputStream getOutputStream() throws IOException
Returns an output stream that writes to this connection.

Note:
URLConnection doesn’t allow writing data to the server by default. You have to call setDoOutput(true) before writing.

public void setDoOutput(boolean doOutput)
Set the DoOutput flag to true if you intend to use the URL connection for output, false if not. The default is false.

import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.net.URL;
import java.net.URLConnection;

public class URLConnectionUtil {

  public static void writeDataToServer(String url, String data) {
    try {
      URL u = new URL(url);

      URLConnection urlConnection = u.openConnection();
      urlConnection.setDoOutput(true);
      
      OutputStream outputStream = urlConnection.getOutputStream();
      OutputStream buffered = new BufferedOutputStream(outputStream);
      OutputStreamWriter out = new OutputStreamWriter(buffered, "8859_1");
      
      out.write(data);
      
      out.flush();
      
      out.close();
    } catch (IOException ex) {
      System.err.println(ex);
    }
  }
}
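
For HTTP endpoints, the standard JDK implementation usually buffers the request body and completes the exchange only when the response is read, so a POST typically ends with getResponseCode() or getInputStream(). A minimal sketch under that assumption, sending form-encoded data. The class name FormPost, the helpers encodeForm and post, and the sample field values are hypothetical:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class FormPost {

  // Encodes key/value pairs as application/x-www-form-urlencoded text
  public static String encodeForm(Map<String, String> fields) {
    StringBuilder body = new StringBuilder();
    try {
      for (Map.Entry<String, String> field : fields.entrySet()) {
        if (body.length() > 0) {
          body.append('&');
        }
        body.append(URLEncoder.encode(field.getKey(), "UTF-8"))
            .append('=')
            .append(URLEncoder.encode(field.getValue(), "UTF-8"));
      }
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException("UTF-8 is always supported", e);
    }
    return body.toString();
  }

  // Posts the form to the given endpoint and returns the HTTP status code
  public static int post(String endpoint, Map<String, String> fields) throws IOException {
    HttpURLConnection connection = (HttpURLConnection) new URL(endpoint).openConnection();
    connection.setRequestMethod("POST");
    connection.setDoOutput(true); // required before getOutputStream()
    connection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");

    byte[] body = encodeForm(fields).getBytes(StandardCharsets.UTF_8);
    try (OutputStream out = connection.getOutputStream()) {
      out.write(body);
    }
    try {
      return connection.getResponseCode(); // reading the response completes the exchange
    } finally {
      connection.disconnect();
    }
  }

  public static void main(String[] args) {
    Map<String, String> fields = new LinkedHashMap<>();
    fields.put("name", "Jane Doe");
    fields.put("topic", "java&net");
    // Prints the encoded body only; post(...) needs a real endpoint
    System.out.println(encodeForm(fields));
  }
}
```

The main method only prints the encoded body, because post(...) requires an endpoint that accepts form data.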

URLConnection : getHeaderFieldKey : Get the nth header field

public String getHeaderFieldKey(int n)

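Returns the key for the nth header field, or null if there are fewer than n+1 fields. For HTTP connections, header zero is usually the response status line and has a null key, which is why the loop below starts at index 1.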

import java.net.URL;
import java.net.URLConnection;

public class Main {

    public static void main(String args[]) throws Exception {
        String resource = "https://self-learning-java-tutorial.blogspot.com";

        /* Construct URL object */
        URL url = new URL(resource);

        /* Open URLConnection to this URL */
        URLConnection conn = url.openConnection();

        for (int i = 1;; i++) {
            String headerField = conn.getHeaderFieldKey(i);
            if (headerField == null) {
                break;
            }
            System.out.println(headerField + " : " + conn.getHeaderField(i));
        }
    }
}

Sample Output
Content-Type : text/html; charset=UTF-8
Expires : Tue, 19 May 2015 15:30:48 GMT
Date : Tue, 19 May 2015 15:30:48 GMT
Cache-Control : private, max-age=0
Last-Modified : Mon, 18 May 2015 10:26:27 GMT
X-Content-Type-Options : nosniff
X-XSS-Protection : 1; mode=block
Server : GSE
Alternate-Protocol : 80:quic,p=1
Accept-Ranges : none
Vary : Accept-Encoding
Transfer-Encoding : chunked


Note
public String getHeaderField(int n)
Returns the value for the nth header field.

URLConnection : Get specific header field

The URLConnection class provides the getHeaderField method, which returns the value of a specific header field.


public String getHeaderField(String name)
Returns the value of the named header field, or null if there is no such field in the header.

import java.net.URL;
import java.net.URLConnection;

public class Main {

  public static void main(String args[]) throws Exception {
    String resource = "https://self-learning-java-tutorial.blogspot.com";

    /* Construct URL object */
    URL url = new URL(resource);

    /* Open URLConnection to this URL */
    URLConnection conn = url.openConnection();

    String contentType = conn.getHeaderField("Content-Type");
    String transferEncoding = conn.getHeaderField("Transfer-Encoding");
    String lastModified = conn.getHeaderField("Last-Modified");

    System.out.println("contentType : " + contentType);
    System.out.println("transferEncoding : " + transferEncoding);
    System.out.println("lastModified : " + lastModified);
  }
}

Sample Output
contentType : text/html; charset=UTF-8
transferEncoding : chunked
lastModified : Mon, 18 May 2015 10:26:27 GMT

URLConnection : Get all header fields and values

The URLConnection class provides the getHeaderFields method, which returns a Map of header field names to their values.


public Map<String,List<String>> getHeaderFields()
Returns a map of header fields and respective values.

import java.net.URL;
import java.net.URLConnection;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class Main {

  public static void main(String args[]) throws Exception {
    String resource = "https://self-learning-java-tutorial.blogspot.com";

    /* Construct URL object */
    URL url = new URL(resource);

    /* Open URLConnection to this URL */
    URLConnection conn = url.openConnection();

    Map<String, List<String>> headerFields = conn.getHeaderFields();

    Set<String> keys = headerFields.keySet();

    for (String key : keys) {
      List<String> values = headerFields.get(key);
      System.out.print(key + ": ");
      for (String value : values) {
        System.out.print(value + "\t");
      }
      System.out.println();
    }
  }
}

Sample Output

null: HTTP/1.1 200 OK  
Expires: Tue, 19 May 2015 15:17:07 GMT  
X-XSS-Protection: 1; mode=block 
Last-Modified: Mon, 18 May 2015 10:26:27 GMT  
Alternate-Protocol: 80:quic,p=1 
Server: GSE 
X-Content-Type-Options: nosniff 
Cache-Control: private, max-age=0 
Date: Tue, 19 May 2015 15:17:07 GMT 
Vary: Accept-Encoding 
Transfer-Encoding: chunked  
Content-Type: text/html; charset=UTF-8  
Accept-Ranges: none 
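
The map returned by getHeaderFields() stores the status line under a null key, as the first line of the sample output shows. A minimal sketch of formatting such a map while treating that entry separately. The class name HeaderFormatter and the method format are hypothetical, and a hand-built map stands in for conn.getHeaderFields():

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HeaderFormatter {

  // Turns a header map into printable lines. The status line is stored
  // under a null key, so it is emitted without a "name: " prefix.
  public static List<String> format(Map<String, List<String>> headerFields) {
    List<String> lines = new ArrayList<>();
    for (Map.Entry<String, List<String>> entry : headerFields.entrySet()) {
      String values = String.join(", ", entry.getValue());
      if (entry.getKey() == null) {
        lines.add(values);
      } else {
        lines.add(entry.getKey() + ": " + values);
      }
    }
    return lines;
  }

  public static void main(String[] args) {
    // Hand-built map standing in for conn.getHeaderFields()
    Map<String, List<String>> headers = new LinkedHashMap<>();
    headers.put(null, Arrays.asList("HTTP/1.1 200 OK"));
    headers.put("Vary", Arrays.asList("Accept-Encoding", "Origin"));
    for (String line : format(headers)) {
      System.out.println(line);
    }
  }
}
```

Iterating over entrySet() avoids a second lookup per key compared with keySet() followed by get().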

URLConnection : read the headers

By using a URLConnection object, you can query the response headers. The URLConnection class provides the following getter methods for commonly used header information.


public String getContentType()
Returns the value of the content-type header field.

public int getContentLength()
Returns the value of the content-length header field. Returns the content length in bytes. Returns -1 if the content length is not known, or if the content length is greater than Integer.MAX_VALUE.

public long getContentLengthLong()
Returns the value of the content-length header field. Returns the content length in bytes. Returns -1 if the content length is not known. This method is used to know the length of large files (where size exceeds integer).

public String getContentEncoding()
Returns the value of the content-encoding header field. It returns null if the content encoding is not known.

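public long getDate()
Returns the sending date of the resource, as the number of milliseconds since January 1, 1970 GMT. Returns 0 if the date is not known.

public long getExpiration()
Returns the expiration date of the resource, as the number of milliseconds since January 1, 1970 GMT. Returns 0 if the expiration date is not known.

public long getLastModified()
Returns the date the resource was last modified, as the number of milliseconds since January 1, 1970 GMT. Returns 0 if the date is not known.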

import java.net.URL;
import java.net.URLConnection;
import java.util.Date;

public class Main {

  public static void main(String args[]) throws Exception {
    String resource = "https://self-learning-java-tutorial.blogspot.com";

    /* Construct URL object */
    URL url = new URL(resource);

    /* Open URLConnection to this URL */
    URLConnection conn = url.openConnection();

    String contentType = conn.getContentType();
    int conLengthInt = conn.getContentLength();
    long contentLengthLong = conn.getContentLengthLong();
    String contentEncoding = conn.getContentEncoding();

    long dateInMillis = conn.getDate();
    Date documentSent = new Date(dateInMillis);

    long expirationMillis = conn.getExpiration();
    Date expireDate = new Date(expirationMillis);

    long lastModifiedMills = conn.getLastModified();
    Date lastModifiedDate = new Date(lastModifiedMills);

    System.out.println("contentType = " + contentType);
    System.out.println("conLengthInt = " + conLengthInt);
    System.out.println("contentLengthLong = " + contentLengthLong);
    System.out.println("contentEncoding = " + contentEncoding);
    System.out.println("documentSent = " + documentSent);
    System.out.println("expireDate = " + expireDate);
    System.out.println("lastModifiedDate = " + lastModifiedDate);

  }
}


Sample Output

contentType = text/html; charset=UTF-8
conLengthInt = -1
contentLengthLong = -1
contentEncoding = null
documentSent = Tue May 19 20:40:26 IST 2015
expireDate = Tue May 19 20:40:26 IST 2015
lastModifiedDate = Mon May 18 15:56:27 IST 2015
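
The date getters return 0 when the corresponding header is missing, and new Date(0) would then print January 1, 1970 rather than signal a missing value. A minimal sketch of one way to handle that, assuming an empty Optional is an acceptable way to represent "not known". The class name HeaderDates and the method toInstant are hypothetical, and literal values stand in for the getter results:

```java
import java.time.Instant;
import java.util.Optional;

public class HeaderDates {

  // URLConnection date getters return 0 when the header is absent,
  // so 0 is mapped to an empty Optional instead of 1 January 1970.
  public static Optional<Instant> toInstant(long epochMillis) {
    if (epochMillis == 0L) {
      return Optional.empty();
    }
    return Optional.of(Instant.ofEpochMilli(epochMillis));
  }

  public static void main(String[] args) {
    // Literal values stand in for conn.getDate() and conn.getExpiration()
    System.out.println(toInstant(1432048226000L));
    System.out.println(toInstant(0L));
  }
}
```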

URLConnection : Reading data from a server

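Reading data from a server is a four-step process.

1. Construct URL Object
         URL url = new URL(resource);

2. Open connection to URL object
         URLConnection conn = url.openConnection();

3. Get the input stream from URL connection
         InputStream is = conn.getInputStream();

4. Read data and close the connection

The following application puts all four steps together.
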
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;

public class Main {

  public static String readData(String resource) {
    StringBuilder content = new StringBuilder();

    try {
      /* Construct URL object */
      URL url = new URL(resource);

      /* Open URLConnection to this URL */
      URLConnection conn = url.openConnection();

      /* Get input stream for the connection; try-with-resources closes it */
      try (InputStream is = conn.getInputStream();
          BufferedReader br = new BufferedReader(new InputStreamReader(is))) {

        /* Read data from connection line by line */
        String str;
        while ((str = br.readLine()) != null) {
          content.append(str).append(System.lineSeparator());
        }
      }

    } catch (MalformedURLException e) {
      e.printStackTrace();
    } catch (IOException e) {
      e.printStackTrace();
    }

    /* Return the content so that the caller decides what to do with it */
    return content.toString();
  }

  public static void main(String args[]) throws UnsupportedEncodingException {
    String resource = "https://self-learning-java-tutorial.blogspot.com";
    System.out.println(readData(resource));
  }
}

Note
Invoking the close() method on the InputStream or OutputStream of a URLConnection after a request may free network resources associated with this instance.

Tuesday, June 28, 2022

Java URI class

URI stands for Uniform Resource Identifier. A URI is a string of characters used to identify a resource, either by name or by location.


URIs fall into two types: URL and URN.

Uniform Resource Locator (URL): This is an address used to identify network/resource locations.
Example: http://gmail.com

Uniform Resource Name (URN): This is a persistent name that is independent of any address. A URN can be used to identify a resource without implying its location or how to access it.

Example: URN:ISBN:0-395-36341-1

The example above is a unique reference within the International Standard Book Number (ISBN) identifier system. It references a resource, but doesn’t specify how to obtain an actual copy of the book.

Usually, you should use the URL class when you want to download the content at a URL and the URI class when you want to use the URL for identification rather than retrieving the content.
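
A minimal sketch of the difference, using the two example identifiers from above. URI.toURL() succeeds only when the URI is absolute and Java has a protocol handler for its scheme, so the URN can be held as a URI but not turned into a URL. The class name UriToUrl and the method isLocatable are hypothetical:

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

public class UriToUrl {

  // Returns true when the URI can be converted to a URL, that is, when
  // it is absolute and Java has a protocol handler for its scheme.
  public static boolean isLocatable(URI uri) {
    try {
      URL url = uri.toURL();
      return url != null;
    } catch (MalformedURLException | IllegalArgumentException e) {
      return false; // unknown protocol, or the URI is not absolute
    }
  }

  public static void main(String[] args) {
    System.out.println(isLocatable(URI.create("http://gmail.com")));
    System.out.println(isLocatable(URI.create("URN:ISBN:0-395-36341-1")));
  }
}
```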

The following is the general syntax of a URI.
[scheme:]scheme-specific-part[#fragment]
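
A minimal sketch that splits a URI string into the three parts of this syntax by using getScheme(), getSchemeSpecificPart() and getFragment(). The class name UriParts, the method split, and the sample URIs are assumptions made for illustration:

```java
import java.net.URI;

public class UriParts {

  // Splits a URI into scheme, scheme-specific part and fragment.
  // A missing part is reported as null.
  public static String[] split(String text) {
    URI uri = URI.create(text);
    return new String[] {
        uri.getScheme(), uri.getSchemeSpecificPart(), uri.getFragment()
    };
  }

  public static void main(String[] args) {
    String[] samples = {
        "http://www.google.com/search?q=java#top",
        "mailto:user@example.com"
    };
    for (String text : samples) {
      String[] parts = split(text);
      System.out.println("scheme=" + parts[0] + ", ssp=" + parts[1]
          + ", fragment=" + parts[2]);
    }
  }
}
```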

The URI class provides the following constructors to initialize a URI object.

URI(String str)
URI(String scheme, String ssp, String fragment)
URI(String scheme, String userInfo, String host, int port, String path, String query, String fragment)
URI(String scheme, String host, String path, String fragment)
URI(String scheme, String authority, String path, String query, String fragment)

URI(String str)
Constructs a URI by parsing the given string. If the string passed to this constructor does not follow the URI syntax rules, it throws URISyntaxException.

import java.net.URI;
import java.net.URISyntaxException;

public class Main {
  public static void main(String args[]) throws URISyntaxException {
    URI uri1 = new URI("http://www.google.com");
    URI uri2 = new URI("URN:ISBN:0-395-36341-1");

    System.out.println(uri1);
    System.out.println("Authority : " + uri1.getAuthority());
    System.out.println("Fragment : " + uri1.getFragment());
    System.out.println("Host : " + uri1.getHost());
    System.out.println("Scheme : " + uri1.getScheme());
    System.out.println("********************************");

    System.out.println(uri2);
    System.out.println("Authority : " + uri2.getAuthority());
    System.out.println("Fragment : " + uri2.getFragment());
    System.out.println("Host : " + uri2.getHost());
    System.out.println("Scheme : " + uri2.getScheme());
  }
}


Output

http://www.google.com
Authority : www.google.com
Fragment : null
Host : www.google.com
Scheme : http
********************************
URN:ISBN:0-395-36341-1
Authority : null
Fragment : null
Host : null
Scheme : URN


URI(String scheme, String ssp, String fragment)
Constructs a URI from the components scheme (http, ftp, smtp, etc.), ssp (scheme-specific part), and fragment. The final result looks like “scheme:ssp#fragment”.

import java.net.URI;
import java.net.URISyntaxException;

public class Main {
  public static void main(String args[]) throws URISyntaxException {
    URI uri1 = new URI("http", "//www.google.com", "search1");

    System.out.println(uri1);
    System.out.println("Authority : " + uri1.getAuthority());
    System.out.println("Fragment : " + uri1.getFragment());
    System.out.println("Host : " + uri1.getHost());
    System.out.println("Scheme : " + uri1.getScheme());
  }
}


Output

http://www.google.com#search1
Authority : www.google.com
Fragment : search1
Host : www.google.com
Scheme : http


public URI(String scheme, String host, String path, String fragment) throws URISyntaxException
Constructs a URI from the components scheme (http, ftp, smtp, etc.), host (host name), path, and fragment. The final result looks like “scheme://host/path#fragment”.

import java.net.URI;
import java.net.URISyntaxException;

public class Main {
  public static void main(String args[]) throws URISyntaxException {
    URI uri = new URI("http", "www.docs.oracle.com", "/javase/7/docs/api/java/net/URI.html", "URI");
    
    System.out.println(uri);
    System.out.println("Authority : " + uri.getAuthority());
    System.out.println("Fragment : " + uri.getFragment());
    System.out.println("Host : " + uri.getHost());
    System.out.println("Scheme : " + uri.getScheme());
  }
}


Output
http://www.docs.oracle.com/javase/7/docs/api/java/net/URI.html#    
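
For the seven-argument constructor in the list of constructors, a minimal sketch that builds a URI with a port and a query. The class name UriFromComponents, the method build, and the host, port and path values are assumptions made for illustration. Passing null for userInfo or fragment leaves that component out, and -1 for the port means no port:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriFromComponents {

  // Builds a hierarchical https URI from its individual components.
  // Illegal characters in the path, such as spaces, are percent-encoded
  // by the constructor.
  public static URI build(String host, int port, String path, String query) {
    try {
      return new URI("https", null, host, port, path, query, null);
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    URI uri = build("example.com", 8443, "/docs/user guide.html", "lang=en");
    System.out.println(uri);
    System.out.println("Port : " + uri.getPort());
    System.out.println("Path : " + uri.getPath());
    System.out.println("Query : " + uri.getQuery());
  }
}
```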
