Wednesday, June 14, 2023

Word Count Statistics

d = '''

Full Transcript: NVIDIA CEO Jensen Huang's Commencement Address at National Taiwan University

Ladies and gentlemen, esteemed faculty members, distinguished guests, proud parents, and above all, the 2023 graduating class of the National Taiwan University. Today is a very special day for you, and a dream come true for your parents.

...

...
 You will endure pain and suffering needed to realize your dreams. And you will make sacrifices to dedicate yourself to a life of purpose and doing your life's work.

Class of 2023, I extend my heartfelt congratulations to each one of you.  Jiayou!

'''

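# Replace the punctuation marks : . ' , with spaces so that words split cleanly
d = d.replace(':', ' ').replace('.', ' ').replace('\'', ' ').replace(',', ' ')

# Keep tokens longer than one character that compare greater than '9' as strings;
# this drops most tokens that begin with a digit or ASCII punctuation
d1 = [i for i in d.split() if i > '9' and len(i) > 1]

# Unique tokens (case-sensitive, so 'You' and 'you' count separately)
s = set(d1)

# Build [count, word] pairs, then sort by count, highest first
r = []

for i in s:

    r.append([d1.count(i), i])

r.sort()

r.reverse()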

#for i in r[:100]:

#    print(i[0],':',i[1])

#

#r100 = r[0:100]

#s100 =0

#for i in r100:

#    s100 = s100 + i[0]

#p = s100/len(d1)*100

#p = int(p*100+0.5)/100

#print('Knowing the top', len(r100), 'words is enough to understand', p, '% of the full 1781-word article')

#

#r200 = r[0:200]

#s200 =0

#for i in r200:

#    s200 = s200 + i[0]

#p = s200/len(d1)*100

#p = int(p*100+0.5)/100

#print('Knowing the top', len(r200), 'words is enough to understand', p, '% of the full 1781-word article')

#

#r300 = r[0:300]

#s300 =0

#for i in r300:

#    s300 = s300 + i[0]

#p = s300/len(d1)*100




#p = int(p*100+0.5)/100

#print('Knowing the top', len(r300), 'words is enough to understand', p, '% of the full 1781-word article')

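# For the top 100, 200, ..., 700 words, compute the share of all counted words they cover
for n in range(100, 800, 100):

    rr = r[0:n]

    total = 0

    for i in rr:

        total = total + i[0]

    p = total / len(d1) * 100

    # Round to two decimal places
    p = int(p * 100 + 0.5) / 100

    print(f'Knowing the top {len(rr)} words is enough to understand {p}% of the full {len(d1)}-word article')


# Output

# Knowing the top 100 words is enough to understand 53.23% of the full 1781-word article

# Knowing the top 200 words is enough to understand 67.66% of the full 1781-word article

# Knowing the top 300 words is enough to understand 76.59% of the full 1781-word article

# Knowing the top 400 words is enough to understand 82.2% of the full 1781-word article

# Knowing the top 500 words is enough to understand 87.82% of the full 1781-word article

# Knowing the top 600 words is enough to understand 93.43% of the full 1781-word article

# Knowing the top 700 words is enough to understand 99.05% of the full 1781-word article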
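
The script above calls d1.count once for every unique word, which rescans the whole word list each time. As a rough sketch only, the same coverage figures could be computed in one pass with collections.Counter from the standard library. The function name coverage_by_rank and the sample sentence are made up for illustration and are not part of the original script, and this version rounds with the built-in round rather than the int(p*100+0.5)/100 expression used above. Words with equal counts may rank in a different order than with sort and reverse, but the coverage percentages do not depend on that order.

```python
from collections import Counter

def coverage_by_rank(words, steps):
    """Return {n: percent of all tokens covered by the n most frequent words}."""
    total = len(words)
    if total == 0:
        # Nothing to count, so every step covers 0 percent
        return {n: 0.0 for n in steps}
    counts = Counter(words)
    # Occurrence counts sorted from most to least frequent
    ranked = sorted(counts.values(), reverse=True)
    result = {}
    for n in steps:
        covered = sum(ranked[:n])
        result[n] = round(covered / total * 100, 2)
    return result

# Hypothetical sample text, not the speech transcript
sample = "the cat and the dog and the bird".split()
print(coverage_by_rank(sample, [1, 2, 3]))
# → {1: 37.5, 2: 62.5, 3: 75.0}
```

With the filtered list from the script, the equivalent call would be coverage_by_rank(d1, range(100, 800, 100)).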