Given a collection D of documents. For any keyword (or index term) w, the document frequency of w is the number of documents in D that contain w.
Given a collection D of documents. For any keyword (or index term) w, the document frequency of w, written $df_w$, is the number of documents in D that contain w. We sort all keywords in decreasing order of their document frequencies. Let $r_w$ denote the rank of w, i.e., the position of w in the sorted list. Assume that the following Zipf's Law holds:

$$df_w = \frac{A}{r_w}$$

Here, A is a constant. Suppose that there are N distinct keywords. Under the above Zipf's Law, what is the size of the inverted indices for D? Note: You shall estimate the total number of nodes in the inverted indices.
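
One way to approach the estimate, offered as a sketch rather than a definitive answer: assume the inverted list of each keyword holds one node per document containing that keyword, so the keyword at rank r contributes A / r nodes. Summing over the N ranks gives A * (1 + 1/2 + ... + 1/N), which is A times the N-th harmonic number and is approximately A * (ln N + 0.5772). The function names and the sample values of A and N below are hypothetical and chosen only for illustration.

```python
import math

# Euler-Mascheroni constant, used in the approximation H_N ~ ln N + gamma.
EULER_GAMMA = 0.5772156649015329


def total_nodes_exact(a: float, n: int) -> float:
    """Sum A / r over ranks r = 1..N.

    Assumption: one inverted-list node per (keyword, document) pair,
    so the keyword at rank r contributes A / r nodes.
    """
    return sum(a / r for r in range(1, n + 1))


def total_nodes_approx(a: float, n: int) -> float:
    """Closed-form estimate A * (ln N + gamma) of the same sum."""
    return a * (math.log(n) + EULER_GAMMA)


if __name__ == "__main__":
    # Hypothetical sample values, not taken from the question.
    a, n = 1000.0, 10_000
    print("exact sum    :", total_nodes_exact(a, n))
    print("approximation:", total_nodes_approx(a, n))
```

Under these assumptions the total grows as roughly A ln N, i.e. only logarithmically in the number of distinct keywords.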