Nancy Melucci
Courses Plus Student 36,143 Points
Accessing first 10 key-value pairs in nested dictionary, creating a new nested dictionary with them.
I am at the end of a long data science assignment using python/pandas. My last step is to take a nested dictionary of the words spoken by the 4 major characters on Seinfeld and produce a new dictionary of the top 10. I've sorted, removed stop words and all that fun stuff. I just need a new version of the nested dictionary. I've gotten this far:
for k1 in anotherDict:
    for k2, v2 in v1:
        take(10, anotherDict[k1][k2])
print(*anotherDict.items())
which produces this unpleasantness:
('ELAINE', <itertools.islice object at 0x0000013711892EA8>) ('GEORGE', <itertools.islice object at 0x00000137118FD4A8>) ('JERRY', <itertools.islice object at 0x00000137118FD868>) ('KRAMER', <itertools.islice object at 0x00000137118FD0E8>)
I am still struggling to become competent, but I also know that those hex expressions may be covering...a wrong result. So, if someone would help me figure out 1) If my strategy is sound and 2) how to reveal the contents of the objects, I would be extremely grateful. Thanks.
Chris Freeman
Treehouse Moderator 68,454 Points
Since each islice object is an iterable, use list(x) instead of x in Brandon White’s answer:
# the (name, list) pair needs its own parentheses inside a comprehension
print([(name, list(islce))
       for name, islce
       in anotherDict.items()])
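A minimal sketch of why the islice objects print as hex addresses, and how list() or dict() reveals what they hold. The name word_counts and its values are made up for illustration; they are not the real Seinfeld data.

```python
from itertools import islice

# Hypothetical stand-in for one character's word counts (not the real data)
word_counts = {"I": 5, "You": 3, "know": 2, "like": 1}

# islice is lazy: printing it shows only the object's type and address
lazy = islice(word_counts.items(), 2)
print(lazy)

# Consuming the iterator with dict() (or list()) materializes the pairs
first_two = dict(islice(word_counts.items(), 2))
print(first_two)  # {'I': 5, 'You': 3}
```

Because dictionaries keep insertion order, this takes the first pairs in whatever order the dictionary was built, so it gives a "top n" only if the dictionary was already sorted by count.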
Chris Freeman
Treehouse Moderator 68,454 Points
Starting with Python 3.7 (and in CPython 3.6 as an implementation detail), dictionary key order is the same as insertion order. So I’m unclear what you mean by the “first 10 keys”. These would be the first 10 to be inserted. Maybe if you sorted the keys, then took the first n values, you could produce a new dict. The example below creates a new dict from the first 2 (of 4) keys:
>>> d = {'a': 1, 'c': 2, 'b': 3, 'd': 4}
>>> d
{'a': 1, 'c': 2, 'b': 3, 'd': 4}
>>> dict([(key, d[key]) for key in sorted(d.keys())][:2])
{'a': 1, 'b': 3}
6 Answers
Nancy Melucci
Courses Plus Student 36,143 Points
I suspect it would work to print out the inner dictionary. I am not sure how I could limit it to the top ten (the first 10 key-values) as they are sorted descending. I'll try to adapt it tomorrow.
Nancy Melucci
Courses Plus Student 36,143 Points
Hi Brandon and Chris: The suggestions aren't working - neither gets rid of the "islice object problem" and if I add the little unpack asterisk, the interpreter complains. I've tried a new strategy to see if I can avoid iterables - just seeing if I can assign the first 10 values for each character (the first 10 items of their individual word dictionaries) to a new, shortened version.
newd = dict()
for k1 in anotherDict:
    for k2 in range(10):  # why isn't it possible to just get the first 10 key-value pairs?
        newd = anotherDict.items()
print(dict(newd))
It runs, but the inner dictionary remains the same length. I have a feeling some version of this might work, if I could just restrict it to the first 10 key-value pairs in each inner dictionary.
Thanks for responding. I've still got a week to the deadline but as this is the most challenging thing I've done using notebooks, I am trying to get out ahead of it.
Nancy Melucci
Courses Plus Student 36,143 Points
What I did:
k1lst = list(anotherDict.keys())
print(k1lst) # to check contents
vallist = []
n = 10
top_ten = {}
for k1 in anotherDict:
    first_n_pairs = list(anotherDict[k1].items())[:10]
    vallist.append(first_n_pairs)
print(vallist)  # to check contents
top_ten_dict = dict(zip(k1lst, vallist))
print(top_ten_dict)
"""Result: {'ELAINE': [('I', 2604), ('You', 598), ("I'm", 495), ('Oh,', 490), ('What', 398), ('know', 329), ('Well,', 303), ('like', 282), ('get', 269), ('got', 261)], 'GEORGE': [('I', 3946), ('You', 879), ("I'm", 823), ('like', 543), ('What', 520), ('know', 479), ('get', 460), ("It's", 415), ('think', 371), ('got', 367)], 'JERRY': [('I', 4665), ('You', 1188), ("I'm", 884), ('What', 799), ('like', 751), ('know', 693), ('get', 682), ('Oh,', 586), ('Well,', 557), ('it.', 517)], 'KRAMER': [('I', 2155), ("I'm", 551), ('You', 516), ('Well,', 512), ('Yeah,', 396), ('Oh,', 391), ('got', 384), ('get', 313), ('know', 290), ('like', 278)]}"""
It's close enough, and I have a week to figure out how to make the tuples into key-value pairs. It does what was asked, though: those tuples are the 10 most common words spoken by each character, with their counts (minus stop words).
Thank you for responding to me. I appreciated your time.
Chris Freeman
Treehouse Moderator 68,454 Points
If you have a list of tuples, you can create a new dictionary from them using:
dict(list_of_tuples)
See dict() in docs for various ways to make a new dict from other data.
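As a rough sketch of how that could apply to the nested result posted above: calling dict() on each inner list of tuples turns the tuples into key-value pairs, and a dict comprehension keeps the outer structure nested. The names top_ten_lists and top_ten_nested are hypothetical, and the sample is truncated to two pairs per character.

```python
# Truncated sample shaped like the posted result:
# character names mapped to lists of (word, count) tuples
top_ten_lists = {
    "ELAINE": [("I", 2604), ("You", 598)],
    "GEORGE": [("I", 3946), ("You", 879)],
}

# dict() converts each list of tuples into an inner dictionary;
# the comprehension rebuilds the outer dictionary around them
top_ten_nested = {name: dict(pairs) for name, pairs in top_ten_lists.items()}
print(top_ten_nested)
# {'ELAINE': {'I': 2604, 'You': 598}, 'GEORGE': {'I': 3946, 'You': 879}}
```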
Nancy Melucci
Courses Plus Student 36,143 Points
Hi Chris,
The point was (as I understood it) to take 2 columns of a dataframe (characters and their lines) and create a nested dictionary of the 4 main characters and the top ten words that they spoke (removing the stop words). I pretty much had everything done except the top 10 part: I deconstructed the dataframe columns, created the first version of the nested dictionary, sorted it, and removed the stop words. I was struggling with extracting the top ten for each character into a new dictionary. I finally succeeded, as explained above, although the top ten words and their counts are presented in tuples. I will keep working on that, although the dictionary remains nested and I wasn't forbidden from using tuples in the inner dictionary.
Thanks!
Nancy Melucci
Nancy Melucci
Courses Plus Student 36,143 Points
Thank you. I will try that tomorrow AM, when I will have more energy for it and can make sure it stays nested. Stay well and safe wherever you are on the planet.
Chris Freeman
Treehouse Moderator 68,454 Points
With a better understanding of your data structure, here's how I would approach it:
# sample data
anotherDict = {
    "abe": {"ab1": 10, "de1": 11, "gh1": 12, "jk1": 13},
    "bob": {"bc2": 20, "de2": 21, "gh2": 22, "jk2": 23},
    "cil": {"ab3": 30, "de3": 31, "gh3": 32, "jk3": 33},
    "dan": {"ab4": 40, "de4": 41, "gh4": 42, "jk4": 43},
}
top_n_dicts = {}
top_count = 2
for name, words in anotherDict.items():
    # sort by count value,
    # -x[1] causes highest value to be first
    sorted_by_count = sorted([(word, count)
                              for word, count
                              in words.items()],
                             key=lambda x: -x[1])
    top_words = sorted_by_count[:top_count]
    # add as new dict
    top_n_dicts[name] = dict(top_words)
print(anotherDict)
print(top_n_dicts)
Post back if you have any questions. Good Luck (from Portland, OR)!!
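For comparison, here is a sketch of the same idea using collections.Counter.most_common from the standard library instead of an explicit sort. This is a swapped-in alternative, not the approach posted above; the variable names and the two-entry sample are illustrative.

```python
from collections import Counter

# Same shape as the sample data above, cut down to two names
another_dict = {
    "abe": {"ab1": 10, "de1": 11, "gh1": 12, "jk1": 13},
    "bob": {"bc2": 20, "de2": 21, "gh2": 22, "jk2": 23},
}
top_count = 2

# Counter accepts a word -> count mapping; most_common(n) returns the
# n highest-count (word, count) pairs, highest first
top_n = {
    name: dict(Counter(words).most_common(top_count))
    for name, words in another_dict.items()
}
print(top_n)
# {'abe': {'jk1': 13, 'gh1': 12}, 'bob': {'jk2': 23, 'gh2': 22}}
```

Either version yields a nested dictionary; the Counter form just avoids writing the sort key by hand.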
Brandon White
Full Stack JavaScript Techdegree Graduate 35,771 Points
Hi Nancy,
Maybe you could try a list comprehension: print([x for x in anotherDict.items()]).
I don’t know that that would work. I’m just kinda throwing something out there. If that doesn’t help, then I’ll do a little research once I’m near my computer.