Question 1 (2.0/2.0 points, graded)
In attempting to derive the cocitation matrix, your friend came up with the following algorithm. Assuming the row indices of the matrix mean that the paper is citing others, and the column indices that the paper is being cited, the algorithm's steps are:
• Construct an empty matrix for C.
• Go through the rows of A one by one.
• For each row r of A, if the row sum is strictly greater than 1, then: for each pair ((r, a), (r, b)) of non-zero entries in row r (meaning that there is an existing relationship), add 1 to C at location (a, b). Note that by following this rule, you will naturally also add 1 to C at location (b, a), as the pair ((r, b), (r, a)) must also be present.
After reading carefully through the proposed steps, please answer the following:
Does this generate the cocitation weighted adjacency matrix? (Yes / No)

What is the big-O complexity of the proposed algorithm, in terms of n, the number of nodes in the graph?
O(n^3)

Question 2 (5.0/5.0 points, graded)
Write the cocitation weighted adjacency matrix, C, in terms of A using matrix operations. Use A^T for Aᵀ and * for matrix multiplication. The diagonals in your answer need not match the diagonals generated by the definition in Question 1; the off-diagonals should match Question 1.
C = A*A^T
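The row-by-row procedure from Question 1 can be sketched in Python and compared against a single matrix product. This is only an illustrative sketch: the function name `cocitation_by_rows` and the 4-paper matrix are made up for the example, and it assumes the orientation stated in the problem (row = citing paper, column = cited paper). With that orientation the product whose off-diagonals reproduce the loop is `A.T @ A`; if the matrix is stored the other way round (row = cited paper), the roles swap and `A @ A.T` is the matching form, so the orientation of A is worth checking before choosing.

```python
import numpy as np

def cocitation_by_rows(A):
    """Friend's algorithm: count pairs of papers cited by the same row.

    Assumes A[r, c] == 1 means paper r cites paper c.
    """
    n = A.shape[0]
    C = np.zeros((n, n), dtype=int)
    for r in range(n):                      # n rows
        cited = np.flatnonzero(A[r])        # papers cited by r
        if len(cited) > 1:                  # row sum strictly greater than 1
            for a in cited:                 # up to n choices
                for b in cited:             # up to n choices -> O(n^3) overall
                    if a != b:
                        C[a, b] += 1        # (b, a) is hit by the swapped pair
    return C

# Hypothetical citation matrix: paper 0 cites 1 and 2, paper 1 cites 2 and 3,
# paper 3 cites 1 and 2, paper 2 cites nothing.
A = np.array([
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
])

C_loop = cocitation_by_rows(A)
C_mat = A.T @ A                             # sum over citing papers k of A[k, a] * A[k, b]
off_diag = ~np.eye(A.shape[0], dtype=bool)  # mask that ignores the diagonal
```

In this toy example papers 1 and 2 are cocited by papers 0 and 3, so both versions give a weight of 2 for that pair. The diagonal differs (the loop leaves it at zero, while the product puts the number of citing papers there), which is consistent with the remark that diagonals need not match.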
Part (b): Bibliographic coupling (5.0/5.0 points, graded)
Two papers are said to be bibliographically coupled if they cite the same other papers. The edge weights in a bibliographic coupling network correspond to the number of common citations between two papers. How do you compute the (weighted) adjacency matrix of the bibliographic coupling, B, from the adjacency matrix of the citation network, A? Write your answer in terms of matrix operations.
B = A^T*A

Part (c): (2 points) Include your answer to this question in your written report. (100 word limit.)
How does the time complexity of your solution involving matrix multiplication in part (a) compare to your friend's algorithm?

Part (d): (3 points) Include your answer to this question in your written report. (200 word limit.)
Bibliographic coupling and cocitation can both be taken as an indicator that papers deal with related material. However, they can in practice give noticeably different results. Why? Which measure is more appropriate as an indicator for similarity between papers?
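The coupling count can be checked in the same way. The sketch below is illustrative only: `coupling_by_definition` and the 4-paper matrix are hypothetical, and it again assumes row = citing paper, column = cited paper. Under that assumption the number of references two papers share is the dot product of their rows, which is the off-diagonal of `A @ A.T`; with the opposite storage orientation the matching product would be `A.T @ A`.

```python
import numpy as np

def coupling_by_definition(A):
    """Count, for each pair of papers, the references they have in common.

    Assumes A[i, k] == 1 means paper i cites paper k.
    """
    n = A.shape[0]
    B = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            if i != j:
                # Elementwise product is 1 only where both i and j cite k.
                B[i, j] = int(np.sum(A[i] * A[j]))
    return B

# Hypothetical citation matrix: paper 0 cites 1 and 2, paper 1 cites 2 and 3,
# paper 3 cites 1 and 2, paper 2 cites nothing.
A = np.array([
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
])

B_def = coupling_by_definition(A)
B_mat = A @ A.T                             # sum over cited papers k of A[i, k] * A[j, k]
C_mat = A.T @ A                             # cocitation under the same orientation
off_diag = ~np.eye(A.shape[0], dtype=bool)  # mask that ignores the diagonal
```

In this small example the two measures already disagree: papers 1 and 2 have a cocitation weight of 2 but a coupling weight of 0, because paper 2 cites nothing. Coupling depends only on a paper's own reference list, whereas cocitation depends on which later papers cite it, which is one way to see why the results asked about in Part (d) can differ.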
A citation network is a directed network where the vertices are academic papers and there is a directed edge from paper A to paper B if paper A cites paper B in its bibliography. Google Scholar performs automated citation indexing and has a useful feature that allows users to find similar papers. In the following, we analyze two approaches for measuring similarity between papers.

Part (a): Co-citation network
Two papers are said to be cocited if they are both cited by the same third paper. The edge weights in the cocitation network correspond to the number of cocitations. In this part, we will discover how to compute the (weighted) adjacency matrix of the cocitation network from the adjacency matrix of the citation network.
• Problem setup: In order to derive the cocitation matrix, we need to derive it as a function of the original adjacency matrix.
• Problem notation: If there is an edge from paper i to paper j, it means that paper i cites paper j. We will denote by A the corresponding adjacency matrix, such that A_ij = 1 means there is a directed edge from i to j. Let us denote by C the cocitation network matrix.
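As a worked step, the definition of cocitation can be written in index form using the notation above (A_ij = 1 when paper i cites paper j). The summation index k, standing for the citing third paper, and n, the number of papers, are introduced here for illustration.

```latex
C_{ab} \;=\; \sum_{k=1}^{n} A_{ka}\,A_{kb} \;=\; \left(A^{\mathsf T} A\right)_{ab}
```

Each term of the sum is 1 exactly when paper k cites both a and b, so the sum counts the papers that cite the pair. For a = b the sum reduces to the number of papers citing a, which is why the diagonal of the matrix form need not agree with a definition that only counts pairs of distinct papers.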