Xuanzheng Lin
2,466 Points

What is the key word in Java string split?

Add a new method named getWords that returns the words from the body of the blog post. Since we don't need to worry about special characters, let's just use the regular expression pattern \s+ (or any one or more white space character) for the parameter to the split method. (Remember to escape the backslash in your Java code.)

com/example/BlogPost.java
package com.example;

import java.util.Date;

public class BlogPost {
  private String mAuthor;
  private String mTitle;
  private String mBody;
  private String mCategory;
  private Date mCreationDate;

  public BlogPost(String author, String title, String body, String category, Date creationDate) {
    mAuthor = author;
    mTitle = title;
    mBody = body;
    mCategory = category;
    mCreationDate = creationDate;
  }

  public String getAuthor() {
    return mAuthor;
  }

  public String getTitle() {
    return mTitle;
  }

  public String getBody() {
    return mBody;
  }

  public String getCategory() {
    return mCategory;
  }

  public Date getCreationDate() {
    return mCreationDate;
  }

  public String getWords() {
    return mBody.split("[^\s]+");
  }
}

1 Answer

Hi there,

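Two things need to change. First, escape the backslash with another backslash, and use the pattern \s+ from the challenge rather than [^\s]+, which matches the words themselves instead of the gaps between them. Second, split returns an array of strings, not a single string: we are splitting a long blog post into individual words and putting them in an array. So the return type should be String[].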

That looks like:

  public String[] getWords(){
    return mBody.split("\\s+");
  }
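
To try this outside the challenge, here is a minimal sketch. BlogPostSketch is a hypothetical, cut-down stand-in for BlogPost that keeps only the body field, and the sample text is made up for illustration:

```java
import java.util.Arrays;

// Hypothetical, cut-down stand-in for BlogPost that keeps only the body field.
public class BlogPostSketch {
    private final String mBody;

    public BlogPostSketch(String body) {
        mBody = body;
    }

    // "\\s+" in Java source is the regex \s+ : one or more whitespace characters.
    public String[] getWords() {
        return mBody.split("\\s+");
    }

    public static void main(String[] args) {
        // Sample body with a double space, a tab and a newline between words.
        BlogPostSketch post = new BlogPostSketch("Treehouse  is\tgreat\nfor learning");
        // prints [Treehouse, is, great, for, learning]
        System.out.println(Arrays.toString(post.getWords()));
    }
}
```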

I hope that helps,

Steve.

Xuanzheng Lin
2,466 Points

Hi, Steve,

Thanks again for helping out. However, why use \s instead of the \w shown in the lecture?

Have a good one : )

Hi there!

No problem for the help - glad to be of assistance!

The regular expression here finds the whitespace and splits the string on it. The expression for a single whitespace character is \s, but here we're also using a quantifier, the +, which means "one or more", so a whole run of consecutive whitespace characters counts as one separator. The extra backslash in the Java code is just there to escape the regex's own backslash inside the string literal.
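
A small sketch of what the quantifier changes. The class name, helper names and sample text are made up for illustration:

```java
import java.util.Arrays;

public class WhitespaceQuantifier {
    // Splits on every single whitespace character.
    static String[] splitOnEach(String text) {
        return text.split("\\s");
    }

    // Splits on runs of one or more whitespace characters.
    static String[] splitOnRuns(String text) {
        return text.split("\\s+");
    }

    public static void main(String[] args) {
        String text = "one  two\t\tthree"; // two spaces, then two tabs

        // Without +, each whitespace character is its own separator,
        // so empty strings appear between adjacent separators.
        System.out.println(Arrays.toString(splitOnEach(text))); // prints [one, , two, , three]

        // With +, each run of whitespace counts once.
        System.out.println(Arrays.toString(splitOnRuns(text))); // prints [one, two, three]
    }
}
```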

If the video used \w, from memory, that's a word character, i.e. [A-Za-z0-9_].

I'll check the video to see whether that makes sense!

Steve.

Yes, Craig uses \w to highlight characters that belong in a word, thus excluding the question mark, hashtag etc. He then negates that, using [^\w]. Incidentally, you can use \W to do the same thing - I don't know if the video covered that too.

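I used \s+ because the question said to do that! \s+ is not the same as [^\w]: \s+ splits on runs of whitespace, i.e. spaces, tabs and newlines, whereas [^\w] splits on every single non-word character, so punctuation is dropped too. So, in the challenge, if the text being split was "Stop! Police!", using \s+ would give us a two-element array, ["Stop!", "Police!"], whereas using [^\w] would give us ["Stop", "", "Police"]. The empty string appears because the "!" and the space after it are two separate delimiters with nothing between them; adding the quantifier, [^\w]+, would give ["Stop", "Police"].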

I think. :smile:
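
To check this rather than guess, here is a quick sketch that runs the "Stop! Police!" example through each pattern. The class and helper names are made up for illustration. Note that a pattern without a quantifier leaves an empty string wherever two delimiters sit side by side, and that split discards trailing empty strings by default:

```java
import java.util.Arrays;

public class SplitComparison {
    static final String TEXT = "Stop! Police!";

    // Splits TEXT on the given regex and renders the resulting array as text.
    static String show(String regex) {
        return Arrays.toString(TEXT.split(regex));
    }

    public static void main(String[] args) {
        System.out.println(show("\\s+"));   // prints [Stop!, Police!]
        System.out.println(show("[^\\w]")); // prints [Stop, , Police]
        System.out.println(show("\\W"));    // prints [Stop, , Police]
        System.out.println(show("\\W+"));   // prints [Stop, Police]
    }
}
```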

I hope that's useful.

Steve.

Xuanzheng Lin
2,466 Points

Hi, Steve,

Understood. Really appreciate that! One last question regarding split: why did you say "escape the backslash with another backslash"?

An escape character changes how Java (and many other languages) reads the character that follows it. Take double quotes, for example. To print those inside a string, you need to escape them so they are treated as part of the string rather than the end of it. This is what the backslash character does: "This string uses \"quotes\" inside a String object". Here the inner " characters are escaped with the backslash.

That's fine, but how do we put a backslash in a regex if Java thinks it is escaping the next character? Say you don't want to add a newline but do need to output \n? You escape the escape character! So we end up with a double backslash: "This adds a newline \n but this just prints backslash n \\n". In the same way, writing "\\s+" in Java source passes the three characters \s+ to the regex engine.
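
A short sketch that makes the escaping visible by printing the strings and their lengths. The class and constant names are made up for illustration:

```java
public class EscapeDemo {
    // Escaped quotes become part of the string instead of ending it.
    static final String QUOTED = "This string uses \"quotes\" inside a String object";
    // A single newline character.
    static final String NEWLINE = "\n";
    // Two characters: a backslash followed by the letter n.
    static final String BACKSLASH_N = "\\n";
    // Three characters: backslash, s, plus. This is what the regex engine receives.
    static final String WHITESPACE_RUN = "\\s+";

    public static void main(String[] args) {
        System.out.println(QUOTED);                  // prints This string uses "quotes" inside a String object
        System.out.println(NEWLINE.length());        // prints 1
        System.out.println(BACKSLASH_N);             // prints \n
        System.out.println(BACKSLASH_N.length());    // prints 2
        System.out.println(WHITESPACE_RUN);          // prints \s+
        System.out.println(WHITESPACE_RUN.length()); // prints 3
        System.out.println("a \t b".split(WHITESPACE_RUN).length); // prints 2
    }
}
```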

I hope that made sense!!

Steve.

Xuanzheng Lin
2,466 Points

Wow, Steve. Thank you so much. You are damn good!

Have a good one : )

I've done a lot of courses! :smile:

Xuanzheng Lin
2,466 Points

Haha you definitely did!