Accessing first 10 key-value pairs in nested dictionary, creating a new nested dictionary with them.

Question

I am at the end of a long data science assignment using Python/pandas. My last step is to take a nested dictionary of the words spoken by the 4 major characters on Seinfeld and produce a new dictionary of the top 10 for each. I've sorted, removed stop words and all that fun stuff. I just need a new version of the nested dictionary. I've gotten this far:

for k1 in anotherDict:
    for k2, v2 in v1:
        take(10, anotherDict[k1][k2])
print(*anotherDict.items())

which produces this unpleasantness:

('ELAINE', <itertools.islice object at 0x0000013711892EA8>) ('GEORGE', <itertools.islice object at 0x00000137118FD4A8>) ('JERRY', <itertools.islice object at 0x00000137118FD868>) ('KRAMER', <itertools.islice object at 0x00000137118FD0E8>)

I am still struggling to become competent, but I also know that those hex expressions may be covering...a wrong result. So, if someone would help me figure out 1) whether my strategy is sound and 2) how to reveal the contents of the objects, I would be extremely grateful. Thanks.
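
On the second point: the `<itertools.islice object at 0x...>` text is only the default display of a lazy iterator, so by itself it does not say whether the result is right or wrong. A minimal sketch, assuming `take` is a thin wrapper around `itertools.islice` (the question does not show its definition) and using invented counts:

```python
from itertools import islice

# Hypothetical inner dictionary; the counts are invented for illustration.
words = {"know": 5, "like": 4, "get": 3, "got": 2}

lazy = islice(words.items(), 2)   # lazy iterator, nothing consumed yet
first_two = list(lazy)            # materialize it to see the contents
print(first_two)                  # [('know', 5), ('like', 4)]
```

Wrapping the iterator in `list()` (or `dict()`) is what reveals the pairs; printing the iterator itself only shows its memory address.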

Answer 1 · 2021-04-11T18:52:43Z

I suspect it would work to print out the inner dictionary. I am not sure how I could limit it to the top ten (the first 10 key-values) as they are sorted descending. I'll try to adapt it tomorrow.

Answer 2 · 2021-04-12T13:43:42Z

Hi Brandon and Chris: The suggestions aren't working - neither gets rid of the "islice object problem" and if I add the little unpack asterisk, the interpreter complains. I've tried a new strategy to see if I can avoid iterables - just seeing if I can assign the first 10 values for each character (the first 10 items of their individual word dictionaries) to a new, shortened version.

newd = dict()
for k1 in anotherDict:
    for k2 in range(10):  #why isn't it possible to just get the first 10 key-value pairs?
        newd = anotherDict.items()
print(dict(newd))

It runs, but the inner dictionary remains the same length. I have a feeling some version of this might work, if I could just restrict it to the first 10 key-value pairs in each inner dictionary.
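
One way to restrict each inner dictionary to its first 10 pairs is to slice its items and rebuild a dict from them. This is only a sketch: it assumes `anotherDict` maps each character to a dict that is already sorted in descending order (dicts keep insertion order in Python 3.7+), and the sample data and the name `first_n` are made up for illustration.

```python
from itertools import islice

# Made-up stand-in for anotherDict; inner dicts are assumed pre-sorted.
anotherDict = {
    "JERRY": {"know": 9, "like": 7, "get": 5, "got": 3},
    "ELAINE": {"like": 8, "know": 6, "got": 4, "get": 2},
}

n = 2  # would be 10 for the assignment

# Build a new nested dict; the original is left untouched.
first_n = {
    character: dict(islice(words.items(), n))
    for character, words in anotherDict.items()
}
print(first_n)
```

The loop in the attempt above reassigns `newd` to the whole `anotherDict.items()` on every pass, which is why nothing gets shorter; here each inner dict is rebuilt from only its first `n` items.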

Thanks for responding. I've still got a week to the deadline but as this is the most challenging thing I've done using notebooks, I am trying to get out ahead of it.

Answer 3 · 2021-04-12T23:19:31Z

What I did:

k1lst = list(anotherDict.keys())
print(k1lst) # to check contents
vallist = []
n = 10
for k1 in anotherDict:
    # keep only the first n (word, count) pairs of each inner dictionary
    first_n_pairs = list(anotherDict[k1].items())[:n]
    vallist.append(first_n_pairs)

print(vallist) #to check contents

top_ten_dict = dict(zip(k1lst, vallist))
print(top_ten_dict)

"""Result: {'ELAINE': [('I', 2604), ('You', 598), ("I'm", 495), ('Oh,', 490), ('What', 398), ('know', 329), ('Well,', 303), ('like', 282), ('get', 269), ('got', 261)],
'GEORGE': [('I', 3946), ('You', 879), ("I'm", 823), ('like', 543), ('What', 520), ('know', 479), ('get', 460), ("It's", 415), ('think', 371), ('got', 367)],
'JERRY': [('I', 4665), ('You', 1188), ("I'm", 884), ('What', 799), ('like', 751), ('know', 693), ('get', 682), ('Oh,', 586), ('Well,', 557), ('it.', 517)],
'KRAMER': [('I', 2155), ("I'm", 551), ('You', 516), ('Well,', 512), ('Yeah,', 396), ('Oh,', 391), ('got', 384), ('get', 313), ('know', 290), ('like', 278)]}"""

It's close enough, and I have a week to figure out how to make the tuples into key-value pairs. It does what was asked, though: those tuples are the 10 most common words spoken by each character, with their counts (minus stop words).
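
Turning each list of tuples back into key-value pairs may need nothing more than `dict()`, which accepts an iterable of 2-tuples. A small sketch using a shortened copy of the result above; the name `nested` is an assumption introduced here:

```python
# Shortened copy of the printed structure: character -> list of (word, count).
top_ten_dict = {
    "ELAINE": [("I", 2604), ("You", 598)],
    "KRAMER": [("I", 2155), ("I'm", 551)],
}

# dict() converts each list of pairs into an inner dictionary.
nested = {character: dict(pairs) for character, pairs in top_ten_dict.items()}
print(nested)
```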

Thank you for responding to me. I appreciated your time.

Answer 4 · 2021-04-13T00:08:22Z

Hi Chris,

The point was (as I understood it) to take 2 columns of a dataframe (characters and their lines) and create a nested dictionary of the 4 main characters and the top ten words that they spoke (removing the stop words). I pretty much had everything done except the top 10 part: I deconstructed the dataframe columns, created the first version of the nested dictionary, sorted it, and removed the stop words. I was struggling with extracting the top ten for each character into a new dictionary. I finally succeeded at it as explained above, although the top ten words and their counts are presented in tuples. I will keep working on that, although the dictionary remains nested and I wasn't forbidden from using tuples in the inner dictionary.

Thanks!

Nancy Melucci

Answer 5 · 2021-04-13T00:13:38Z

Thank you. I will try that tomorrow AM, when I will have more energy for it and can make sure it stays nested. Stay well and safe wherever you are on the planet.

Answer 6 · 2021-04-13T01:53:06Z

With a better understanding of your data structure, here's how I would approach it:

# sample data
anotherDict = {
    "abe": {"ab1": 10, "de1": 11, "gh1": 12, "jk1": 13},
    "bob": {"bc2": 20, "de2": 21, "gh2": 22, "jk2": 23},
    "cil": {"ab3": 30, "de3": 31, "gh3": 32, "jk3": 33},
    "dan": {"ab4": 40, "de4": 41, "gh4": 42, "jk4": 43},    
    }

top_n_dicts = {}
top_count = 2

for name, words in anotherDict.items():
    # sort by count value,
    # -x[1] causes highest value to be first
    sorted_by_count = sorted(words.items(), key=lambda x: -x[1])
    top_words = sorted_by_count[:top_count]
    # add as new dict
    top_n_dicts[name] = dict(top_words)

print(anotherDict)
print(top_n_dicts)
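
As a possible alternative to sorting by hand, the standard library's `collections.Counter` has a `most_common(n)` method that returns the `n` highest-count pairs. This swaps in a different technique from the loop above and is only a sketch on the same kind of sample data:

```python
from collections import Counter

# Same shape as the sample data above, trimmed to two names.
anotherDict = {
    "abe": {"ab1": 10, "de1": 11, "gh1": 12, "jk1": 13},
    "bob": {"bc2": 20, "de2": 21, "gh2": 22, "jk2": 23},
}
top_count = 2

# Counter accepts a word -> count mapping; most_common() sorts descending by count.
top_n_dicts = {
    name: dict(Counter(words).most_common(top_count))
    for name, words in anotherDict.items()
}
print(top_n_dicts)
```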

Post back if you have any questions. Good Luck (from Portland, OR)!!
