1. Given A = {1,2,3,4}, B = {1,2,4}, and C = {1,2,4,5}, what are Jaccard(A,B), Jaccard(B,C), and Jaccard(A,C)?
Answer:
The steps for computing each Jaccard coefficient are as follows.
Jaccard(A,B)
|A| = 4
|B| = 3
|A ∩ B| = 3
|A ∪ B| = |A| + |B| - |A ∩ B| = 4 + 3 - 3 = 4
So Jaccard(A,B) = |A ∩ B| / |A ∪ B| = 3/4 = 0.75
Jaccard(B,C)
|B| = 3
|C| = 4
|B ∩ C| = 3
|B ∪ C| = |B| + |C| - |B ∩ C| = 3 + 4 - 3 = 4
So Jaccard(B,C) = |B ∩ C| / |B ∪ C| = 3/4 = 0.75
Jaccard(A,C)
|A| = 4
|C| = 4
|A ∩ C| = 3
|A ∪ C| = |A| + |C| - |A ∩ C| = 4 + 4 - 3 = 5
So Jaccard(A,C) = |A ∩ C| / |A ∪ C| = 3/5 = 0.6
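The three hand calculations for exercise 1 can be checked with a short Python sketch. The function name `jaccard` and the choice to return 1.0 for two empty sets are assumptions made for illustration; they are not part of the original exercise.

```python
def jaccard(x, y):
    """Return |x ∩ y| / |x ∪ y| for two collections treated as sets."""
    x, y = set(x), set(y)
    union = x | y
    if not union:
        # Assumption: two empty sets are treated as identical.
        return 1.0
    return len(x & y) / len(union)


A = {1, 2, 3, 4}
B = {1, 2, 4}
C = {1, 2, 4, 5}

print(jaccard(A, B))  # 0.75
print(jaccard(B, C))  # 0.75
print(jaccard(A, C))  # 0.6
```

Python's built-in set operators `&` and `|` compute the intersection and union directly, so the inclusion-exclusion step used in the manual solution is not needed here.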
2. Next, consider a query and a set of documents. Suppose we have:
query: ideas of march
doc1: caesar died in march
doc2: the long march
Find the Jaccard coefficient between the query and doc1, and between the query and doc2.
Answer:
Jaccard(Q, doc1)
|Q| = 3
|doc1| = 4
|Q ∩ doc1| = 1 (the only shared term is "march")
|Q ∪ doc1| = |Q| + |doc1| - |Q ∩ doc1| = 3 + 4 - 1 = 6
So Jaccard(Q, doc1) = |Q ∩ doc1| / |Q ∪ doc1| = 1/6 ≈ 0.17
Jaccard(Q, doc2)
|Q| = 3
|doc2| = 3
|Q ∩ doc2| = 1 (the only shared term is "march")
|Q ∪ doc2| = |Q| + |doc2| - |Q ∩ doc2| = 3 + 3 - 1 = 5
So Jaccard(Q, doc2) = |Q ∩ doc2| / |Q ∪ doc2| = 1/5 = 0.2
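For the query and document case, each text is first turned into a set of terms. The sketch below assumes the simplest possible tokenization (lowercasing and splitting on whitespace, with no stemming or stop-word removal); the helper names `term_set` and `jaccard` are hypothetical.

```python
def term_set(text):
    # Assumption: lowercase and split on whitespace only.
    return set(text.lower().split())


def jaccard(x, y):
    """Return |x ∩ y| / |x ∪ y| for two sets."""
    union = x | y
    # Assumption: two empty sets are treated as identical.
    return len(x & y) / len(union) if union else 1.0


query = term_set("ideas of march")
doc1 = term_set("caesar died in march")
doc2 = term_set("the long march")

print(round(jaccard(query, doc1), 2))  # 0.17
print(jaccard(query, doc2))            # 0.2
```

Under this measure doc2 scores higher than doc1 even though both share only the term "march" with the query, because doc2 is shorter and so the union is smaller.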
3. Given three documents:
d1: "Jack London traveled to Oakland"
d2: "Jack London traveled to the city of Oakland"
d3: "Jack traveled from Oakland to London"
What are the Jaccard coefficients J(d1,d2) and J(d1,d3) when the documents are compared using n-gram analysis with n = 2 (bigrams)?
Answer:
Jaccard(d1,d2)
|d1| = 4 (Jack London, London traveled, traveled to, to Oakland)
|d2| = 7 (Jack London, London traveled, traveled to, to the, the city, city of, of Oakland)
|d1 ∩ d2| = 3 (Jack London, London traveled, traveled to)
|d1 ∪ d2| = |d1| + |d2| - |d1 ∩ d2| = 4 + 7 - 3 = 8
So Jaccard(d1,d2) = |d1 ∩ d2| / |d1 ∪ d2| = 3/8 = 0.375
Jaccard(d1,d3)
|d1| = 4 (Jack London, London traveled, traveled to, to Oakland)
|d3| = 5 (Jack traveled, traveled from, from Oakland, Oakland to, to London)
|d1 ∩ d3| = 0 (no bigram occurs in both documents)
|d1 ∪ d3| = |d1| + |d3| - |d1 ∩ d3| = 4 + 5 - 0 = 9
So Jaccard(d1,d3) = |d1 ∩ d3| / |d1 ∪ d3| = 0/9 = 0
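The bigram version of the calculation can be sketched in Python by pairing each word with its successor. The helper names `bigrams` and `jaccard` are hypothetical, and the sketch assumes case-sensitive, whitespace-separated words, which matches how the bigrams are listed in the exercise.

```python
def bigrams(text):
    # Assumption: words are separated by whitespace and compared case-sensitively.
    words = text.split()
    # zip pairs each word with the next one: (w0, w1), (w1, w2), ...
    return set(zip(words, words[1:]))


def jaccard(x, y):
    """Return |x ∩ y| / |x ∪ y| for two sets."""
    union = x | y
    # Assumption: two empty sets are treated as identical.
    return len(x & y) / len(union) if union else 1.0


d1 = "Jack London traveled to Oakland"
d2 = "Jack London traveled to the city of Oakland"
d3 = "Jack traveled from Oakland to London"

print(jaccard(bigrams(d1), bigrams(d2)))  # 0.375
print(jaccard(bigrams(d1), bigrams(d3)))  # 0.0
```

Although d1 and d3 contain nearly the same words, their bigram sets are disjoint, so word order alone drives the coefficient to 0.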