Aditya Puri
1,080 Points
Problem about [\\w#@'] +
I do understand what the split method does, but I can't understand what the line [\w#@'] + means. Can anyone please explain this line character by character to me?
Also what is the + doing there?
2 Answers
Alexander Nikiforov
Java Web Development Techdegree Graduate 22,175 Points
Quote from teacher's notes:
[^\w#@']+ (Matches one or more character that is not in word based characters, #, @ and apostrophe)
Please check resources on the web and play around with them. It comes with experience. I will try my best, but you have to dig on your own and try applying it everywhere.
OK. Here is my best explanation.
I. Brackets []
define a character class: a set of symbols, exactly one of which must match. See the examples at https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
When you write [a], only the single symbol 'a' will match; more symbols or other symbols will not. Please try it in the online regex tester that Craig suggests.
When you write [ab], it matches 'a' or 'b', but still only one symbol.
So, coming back to the regex you asked about, first take a look at the [] part without the PLUS symbol. With brackets we define one symbol.
Let's look inside the brackets:
^
at the start of the brackets means "except". Example:
[^a]
means any single character that is not 'a', so 'b', 'c', 'd' and every other symbol that is not 'a' will match
Let's get back to [^\w]
\w
is the same as [a-zA-Z_0-9]
which means any character from 'a' to 'z', from 'A' to 'Z', from 0 to 9, or the underscore '_'.
So [^\w]
means not 'a-z', not 'A-Z', not 0-9 and not '_'. It could be '+', '-', '=', a space, and others ...
When we write:
[^\w#@]
we list the characters that we want to exclude, which means not 'a-z', not 'A-Z', not 0-9, not '_', not '#' and not '@'
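To check the character classes above one symbol at a time, here is a minimal sketch. The class name CharClassDemo and the helper matchesOne are made up for illustration; the only library call is String.matches, which succeeds only when the whole string matches the pattern.

```java
public class CharClassDemo {
    // String.matches requires the WHOLE string to match the pattern,
    // so a single character class accepts only one-character strings.
    static boolean matchesOne(String charClass, String s) {
        return s.matches(charClass);
    }

    public static void main(String[] args) {
        System.out.println(matchesOne("[ab]", "a"));       // true
        System.out.println(matchesOne("[ab]", "ab"));      // false: two symbols
        System.out.println(matchesOne("[^a]", "b"));       // true: anything but 'a'
        System.out.println(matchesOne("[^a]", "a"));       // false
        System.out.println(matchesOne("[^\\w]", "+"));     // true: '+' is not a word character
        System.out.println(matchesOne("[^\\w]", "_"));     // false: '_' is part of \w
        System.out.println(matchesOne("[^\\w#@]", "#"));   // false: '#' is excluded explicitly
        System.out.println(matchesOne("[^\\w#@]", "?"));   // true
    }
}
```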
Now we come to the plus at the end. Plus means nothing else but "one or more times", so the pattern matches a whole run of the symbols that we don't want, not just one of them.
Craig wants to exclude punctuation signs from the words that split returns, which means that '?', '=', ',' and others act as separators. With PLUS, several of them in a row, like a comma followed by a space or many trailing spaces, count as one separator.
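Here is a small sketch of how the full pattern behaves with split. The class name SplitDemo, the method words and the sample sentence are invented for this example and are not taken from the course code.

```java
import java.util.Arrays;

public class SplitDemo {
    // Splits on runs of characters that are NOT word characters, '#', '@' or apostrophe.
    static String[] words(String text) {
        return text.split("[^\\w#@']+");
    }

    public static void main(String[] args) {
        String text = "Hello, world!  It's #java @treehouse";
        // ", " and "!  " are each treated as ONE separator because of the plus.
        System.out.println(Arrays.toString(words(text)));
        // prints [Hello, world, It's, #java, @treehouse]
    }
}
```

Without the plus, every single punctuation mark or space would be its own separator, and the result would contain empty strings between neighbouring separators.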
I still strongly suggest watching the workshop:
https://teamtreehouse.com/library/regular-expressions-in-java
And reading the documentation:
https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
It comes with experience. Try playing with an online regex tester.
Alexander Nikiforov
Java Web Development Techdegree Graduate 22,175 Points
Please take a look at a nice workshop with Craig:
https://teamtreehouse.com/library/regular-expressions-in-java
That should clarify a lot. Also make sure you check the Teacher's Notes; he puts a lot of external resources there as well.
Aditya Puri
1,080 Points
I still don't understand... please explain character by character... Dennis has done a really vague explanation
Aditya Puri
1,080 Points
I still don't understand the '+' thing... I did try to read the docs and watch the workshop but again, I couldn't understand :(
Alexander Nikiforov
Java Web Development Techdegree Graduate 22,175 Points
[a]
means only one symbol, 'a', will pass
[a]+
means one or more 'a' symbols, so 'a', 'aa', 'aaa', 'aaaa' and any amount of 'a' letters will pass
[\w]
means one word symbol and is the same as [a-zA-Z_0-9], so 'a', 'b', '9', 'Z', '_' etc. will pass
[\w]+
means one or more word characters in any combination will pass, so 'a', 'ab', 'aZ', 'aZ1', 'cbdceAQWE123' will also pass
Well, I don't know how to explain more ...
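The four patterns above can be tried directly with String.matches. This is only a sketch; the class name PlusDemo and its helper methods are hypothetical names chosen for the example.

```java
public class PlusDemo {
    // Exactly one 'a'.
    static boolean isSingleA(String s) {
        return s.matches("[a]");
    }

    // One or more 'a' symbols.
    static boolean isRunOfA(String s) {
        return s.matches("[a]+");
    }

    // One or more word characters (letters, digits, underscore).
    static boolean isWord(String s) {
        return s.matches("[\\w]+");
    }

    public static void main(String[] args) {
        System.out.println(isSingleA("a"));    // true
        System.out.println(isSingleA("aa"));   // false: [a] allows only one symbol
        System.out.println(isRunOfA("aaaa"));  // true: the plus allows repetition
        System.out.println(isRunOfA(""));      // false: plus needs at least one symbol
        System.out.println(isWord("aZ1"));     // true
        System.out.println(isWord("a b"));     // false: the space is not a word character
    }
}
```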
Aditya Puri
1,080 Points
oh.. but why does he put 2 backslashes \\ in his code instead of one \ ?
Alexander Nikiforov
Java Web Development Techdegree Graduate 22,175 Points
It is just the way to pass a correct regex from Java code.
The actual regex has one backslash:
[\w]+
But when you want to put a backslash in a Java string literal, you have to escape it. And the way it is escaped is with another backslash, that is why when you write
split("[\\w]+")
you actually pass the correct regexp
[\w]+
because
\\
is interpreted by the Java compiler as a single backslash ...
Try to read here, for example:
http://www.tutorialspoint.com/java/java_characters.htm
If you write
split("[\w]+")
the code will not compile: \w is not a valid escape sequence in a Java string literal, so the compiler reports an illegal escape character.
Looks strange but that is just Java rules.
In Java string literals the backslash is a special character. When the compiler finds it, it reads the backslash together with the symbol right after it as one unit ... If the symbol after the backslash is 'n', then it will be a newline, so
\n
is transformed to a newline.
Hope it does make sense .. Escape characters are used in many languages, and the backslash is the common way to write these characters: newline, backspace, tab, etc.
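To see what the compiler actually stores for an escaped string, here is a short sketch. The class name EscapeDemo and the method regexSource are assumptions made for this example.

```java
public class EscapeDemo {
    // The Java source contains two backslashes; the resulting String contains one.
    static String regexSource() {
        return "[\\w]+";
    }

    public static void main(String[] args) {
        System.out.println(regexSource());          // prints [\w]+
        System.out.println(regexSource().length()); // prints 5: [ \ w ] +
        System.out.println("\n".length());          // prints 1: \n is a single newline character
        System.out.println("\\".length());          // prints 1: \\ is a single backslash
    }
}
```

So the regex engine never sees the doubled backslash; it receives [\w]+, exactly as you would type it in an online regex tester.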