We optimize, optimize and optimize again

    In my line of work I regularly have to reach for a profiler, because the server performance requirements are documented and must not drop below a certain level. Apart from the obvious architectural changes and fixes, the same patterns keep recurring from module to module and from project to project, each putting extra load on the virtual machine. These are the ones I want to share.
    As it happens, code that works with Date came up most often, so we will start with it:

    Date

    More than a dozen times I have seen a new date object being created in several different places while a single user request is processed. Most often the goal is the same: to get the current time. In the simplest case it looks like this:

        public boolean isValid(Date start, Date end) {
            Date now = new Date();
            return start.before(now) && end.after(now); 
        }
    

    This looks like an obvious and correct solution. In principle it is, apart from two points:
    • Using Date in Java today is probably a bad idea, given that almost all of its methods are already deprecated.
    • There is no point in creating a new date object when the long primitive is enough:

        public boolean isValid(Date start, Date end) {
            long now = System.currentTimeMillis();
            return start.getTime() < now && now < end.getTime();
        }
    


    SimpleDateFormat

    Very often in web projects the task arises of translating a string into a date, or vice versa, a date into a string. The task is quite typical and most often looks like this:

        return new SimpleDateFormat("EEE, d MMM yyyy HH:mm:ss Z").parse(dateString);
    

    This is a correct and quick solution, but if the server has to parse a string for every user request in each of hundreds of threads, it can noticeably hurt performance, because the SimpleDateFormat constructor is rather heavyweight. Besides the formatter itself, many other objects are created, including a far from lightweight Calendar (more than 400 bytes in size).

    The situation could easily be fixed by making SimpleDateFormat a static field, but it is not thread safe, and in a concurrent environment you can easily run into a NumberFormatException.

    The second thought is to use synchronization, but that is rather dubious too. Under heavy contention between threads it may not improve performance at all and can even make it worse.

    But there are at least two solutions:
    • Good old ThreadLocal: create one SimpleDateFormat per thread and reuse it for every subsequent request. This approach can speed up date parsing by a factor of 2-4, because a SimpleDateFormat object is no longer created for each request.
    • Joda and its thread-safe counterpart to SimpleDateFormat, DateTimeFormat. Although Joda as a whole is slower than the default Java Date API, in date parsing they are on par. A few tests can be found here.
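
    The ThreadLocal option could look roughly like the sketch below. It is only an illustration: the class name HttpDateParser is made up, Locale.US is an assumption added so that the English day and month names parse regardless of the server locale, and ThreadLocal.withInitial requires Java 8 or later.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper class; the name is illustrative, the pattern is the one used above.
final class HttpDateParser {
    // One SimpleDateFormat per thread, created lazily on that thread's first call.
    // Locale.US is an assumption so that "Thu", "Jan", etc. parse on any server locale.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() ->
                    new SimpleDateFormat("EEE, d MMM yyyy HH:mm:ss Z", Locale.US));

    private HttpDateParser() {
    }

    static Date parse(String dateString) throws ParseException {
        // Each thread only ever touches its own formatter, so no locking is needed.
        return FORMAT.get().parse(dateString);
    }
}
```

    In a thread pool the formatter lives as long as its thread does, which is usually exactly what is wanted here.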


    Random

    In my projects, the task often arises of returning a random entity to the user. Typically, this kind of code looks like this:

        return items.get(new Random().nextInt(items.size()));
    

    Great, easy, fast. But if the method is called a lot, this means constantly creating new Random objects, which can easily be avoided:

        private static final Random rand = new Random();
        ...
        return items.get(rand.nextInt(items.size()));
    

    It would seem that this is the perfect solution, but it is not that simple. Although Random is thread safe, it can be slow in a multi-threaded environment. Sun/Oracle has already taken care of this:

         return items.get(ThreadLocalRandom.current().nextInt(items.size()));
    

    According to the documentation, this is the most suitable solution for our task: ThreadLocalRandom is much more efficient than Random in a multi-threaded environment. Unfortunately, this class is only available starting from Java 7 (after bug fixes, hi TheShade). In essence, this is the same solution as for SimpleDateFormat, only with a dedicated class.

    Not null

    Many developers, avoiding null values, write something like this:

        public Item someMethod() {
            Item item = new Item();
            //some logic
            if (something) {
                fillItem(item);
            }
            return item;
        }
    

    As a result, even if something never becomes true, a huge number of objects will still be created (provided that the method is called often).
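
    One possible way around this, sketched below, is to allocate only when there is something to fill in and to return a shared immutable instance otherwise. This is an assumption rather than a recipe from the projects mentioned above: the Item and ItemFactory classes, the EMPTY constant and the boolean parameter are all invented for the example, and the approach only works if callers never modify the returned object.

```java
// Hypothetical Item type; a single immutable EMPTY instance stands in for "nothing to return".
final class Item {
    static final Item EMPTY = new Item("");

    private final String name;

    Item(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }
}

// Hypothetical factory illustrating the idea; "something" mirrors the condition in the text.
final class ItemFactory {
    private ItemFactory() {
    }

    static Item someMethod(boolean something) {
        if (!something) {
            // Shared instance: nothing is allocated on this path.
            return Item.EMPTY;
        }
        // A new object is created only when it will actually carry data.
        return new Item("filled");
    }
}
```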

    Regexp

    If you are writing a web application, you are almost certainly using regular expressions. Typical code:

        public boolean isValid(String ip) {
            Pattern pattern = Pattern.compile("xxx");
            Matcher matcher = pattern.matcher(ip);
            return matcher.matches();
        }
    

    As in the first case, every time a new IP address arrives we have to validate it, and again every call creates a batch of new objects. In this particular case the code can be optimized slightly:

        private static final Pattern pattern = Pattern.compile("xxx");
        public boolean isValid(String ip) {
            Matcher matcher = pattern.matcher(ip);
            return matcher.matches();
        }
    

    Ideally the matcher would also be created outside the method, but unfortunately it is not thread safe, so it has to be created again on every call. For a single-threaded environment, though, there is a solution:

        private static final Pattern pattern = Pattern.compile("xxx");
        private final Matcher matcher = pattern.matcher("");
        public boolean isValid(String ip) {
            matcher.reset(ip);
            return matcher.matches();
        }
    

    Which is a perfect fit for... that's right, ThreadLocal.
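
    A minimal sketch of that combination follows. The class name IpValidator is made up, and the regular expression is only a placeholder (four dot-separated groups of one to three digits, with no range check), not a real IP validator. ThreadLocal.withInitial assumes Java 8 or later.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical validator; the pattern is a placeholder and does not check the 0-255 range.
final class IpValidator {
    private static final Pattern PATTERN =
            Pattern.compile("\\d{1,3}(\\.\\d{1,3}){3}");

    // One Matcher per thread, created on first use and then reset for every new input.
    private static final ThreadLocal<Matcher> MATCHER =
            ThreadLocal.withInitial(() -> PATTERN.matcher(""));

    private IpValidator() {
    }

    static boolean isValid(String ip) {
        // reset(CharSequence) points the thread's matcher at the new input and returns it.
        return MATCHER.get().reset(ip).matches();
    }
}
```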

    Truncate date

    Another fairly common task is truncating a date to hours, days or weeks. There are many ways to do this, from Apache's DateUtils to home-made reinventions of the wheel:

        public static Date truncateToHours(Date date) {
            Calendar calendar = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
            calendar.setTime(date);
            // Keep the hour and zero out everything below it
            calendar.set(Calendar.MINUTE, 0);
            calendar.set(Calendar.SECOND, 0);
            calendar.set(Calendar.MILLISECOND, 0);
            return calendar.getTime();
        }
    

    For example, while recently analyzing the code of the map phase of a Hadoop job, I came across these two lines, which consumed 60% of the CPU:

    key.setDeliveredDateContent(truncateToHours(byPeriodKey.getContentTimestamp()));
    key.setDeliveredDateAd(truncateToHours(byPeriodKey.getAdTimestamp()));
    

    This was a big surprise to me, but the profiler does not lie. Fortunately, the map method turned out to be thread safe, so creating the Calendar object could be moved outside the truncateToHours() method, which doubled the speed of the map method.
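
    Moving the Calendar out of the method might look something like this. It is a sketch, not the actual job code: the HourTruncator class is hypothetical, it truncates to the hour as the method name suggests, and it is safe only when each instance is used from a single thread.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper; NOT thread safe, each instance must be used from one thread only.
final class HourTruncator {
    // Created once per instance and reused, instead of calling Calendar.getInstance() per call.
    private final Calendar calendar = Calendar.getInstance(TimeZone.getTimeZone("UTC"));

    Date truncateToHours(Date date) {
        calendar.setTime(date);
        // Keep the hour and zero out everything below it.
        calendar.set(Calendar.MINUTE, 0);
        calendar.set(Calendar.SECOND, 0);
        calendar.set(Calendar.MILLISECOND, 0);
        return calendar.getTime();
    }
}
```

    The returned Date is still a new object on every call; only the much more expensive Calendar is shared.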

    HashCodeBuilder

    I don’t know why, but some developers use the Apache helper classes to generate the hashCode() and equals() methods. For example:

        @Override
        public boolean equals(Object obj) {
            EqualsBuilder equalsBuilder = new EqualsBuilder();
            equalsBuilder.append(id, otherKey.getId());
            ...
        }
        @Override
        public int hashCode() {
            HashCodeBuilder hashCodeBuilder = new HashCodeBuilder();
            hashCodeBuilder.append(id);
            ...
        }
    


    This is, of course, fine if these methods are called only a few times during the life of the application. But if they are called constantly, for example for each key during the sort phase of a Hadoop job, it may well affect execution speed.
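
    For comparison, hand-written versions that allocate no builder objects could look like the sketch below. The ReportKey class and its two fields are invented for the example, and the 31-based hash is just the conventional combination, not something taken from the job in question.

```java
// Hypothetical key class with two illustrative fields.
final class ReportKey {
    private final long id;
    private final String name;

    ReportKey(long id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof ReportKey)) {
            return false;
        }
        ReportKey other = (ReportKey) obj;
        // Plain field comparisons; no helper object is allocated.
        return id == other.id
                && (name == null ? other.name == null : name.equals(other.name));
    }

    @Override
    public int hashCode() {
        // Fold the long into an int, then combine with the name hash using the usual factor 31.
        int result = (int) (id ^ (id >>> 32));
        result = 31 * result + (name == null ? 0 : name.hashCode());
        return result;
    }
}
```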

    Conclusion

    What am I getting at? No, I am by no means urging you to rush off and rework your code just to save on creating a couple of objects. This is food for thought, and it may well prove very useful to someone. Thank you for reading.
