About Me

My name is Alfahri Rabbi Ramadhan. Place and date of birth: Bekasi, 22 December 2000

Friday, 08 April 2022

Jaccard

1. Given A = {1,2,3,4}, B = {1,2,4}, and C = {1,2,4,5}, what are Jaccard(A,B), Jaccard(B,C), and Jaccard(A,C)?

Answer:

The steps for computing each of these Jaccard coefficients are as follows.

Jaccard(A,B)

|A| = 4

|B| = 3

|A ∩ B| = 3

|A ∪ B| = |A| + |B| - |A ∩ B| = 4 + 3 - 3 = 4

So Jaccard(A,B) = |A ∩ B| / |A ∪ B| = 3/4 = 0.75


Jaccard(B,C)

|B| = 3

|C| = 4

|B ∩ C| = 3

|B ∪ C| = |B| + |C| - |B ∩ C| = 3 + 4 - 3 = 4

So Jaccard(B,C) = |B ∩ C| / |B ∪ C| = 3/4 = 0.75


Jaccard(A,C)

|A| = 4

|C| = 4

|A ∩ C| = 3

|A ∪ C| = |A| + |C| - |A ∩ C| = 4 + 4 - 3 = 5

So Jaccard(A,C) = |A ∩ C| / |A ∪ C| = 3/5 = 0.6
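
The three results above can be checked with a few lines of Python. This is only a minimal sketch using Python's built-in sets. The function name jaccard and the choice to return 1.0 for two empty sets are assumptions made here, not part of the exercise.

```python
def jaccard(x: set, y: set) -> float:
    """Return |x ∩ y| / |x ∪ y| for two sets."""
    union = x | y
    if not union:
        # Assumed convention: two empty sets count as identical.
        return 1.0
    return len(x & y) / len(union)


A = {1, 2, 3, 4}
B = {1, 2, 4}
C = {1, 2, 4, 5}

print(jaccard(A, B))  # 0.75
print(jaccard(B, C))  # 0.75
print(jaccard(A, C))  # 0.6
```

Computing the union directly with x | y gives the same size as the formula |A| + |B| - |A ∩ B| used in the steps above.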


2. Next, consider the case of a query and documents. Suppose we have:

query : ideas of march

doc 1 : caesar died in march

doc 2 : the long march 

Find the Jaccard coefficient between the query and doc 1, and between the query and doc 2.

Answer:

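Jaccard(Q, doc1)

|Q| = 3 (ideas, of, march)

|doc1| = 4 (caesar, died, in, march)

|Q ∩ doc1| = 1 (march)

|Q ∪ doc1| = |Q| + |doc1| - |Q ∩ doc1| = 3 + 4 - 1 = 6

So Jaccard(Q, doc1) = |Q ∩ doc1| / |Q ∪ doc1| = 1/6 ≈ 0.17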

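Jaccard(Q, doc2)

|Q| = 3 (ideas, of, march)

|doc2| = 3 (the, long, march)

|Q ∩ doc2| = 1 (march)

|Q ∪ doc2| = |Q| + |doc2| - |Q ∩ doc2| = 3 + 3 - 1 = 5

So Jaccard(Q, doc2) = |Q ∩ doc2| / |Q ∪ doc2| = 1/5 = 0.2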
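
The same calculation can be sketched in Python by treating each text as a set of words. The helper names word_set and jaccard_text are hypothetical, and the tokenization (lowercasing and splitting on whitespace, with no stemming or stop-word removal) is an assumption that happens to match the manual counts above.

```python
def word_set(text: str) -> set:
    # Assumed tokenization: lowercase, then split on whitespace.
    return set(text.lower().split())


def jaccard_text(a: str, b: str) -> float:
    """Jaccard coefficient between the word sets of two texts."""
    x, y = word_set(a), word_set(b)
    union = x | y
    # Assumed convention: two empty texts count as identical.
    return len(x & y) / len(union) if union else 1.0


query = "ideas of march"
doc1 = "caesar died in march"
doc2 = "the long march"

print(round(jaccard_text(query, doc1), 2))  # 0.17
print(round(jaccard_text(query, doc2), 2))  # 0.2
```

Under this measure doc 2 scores slightly higher than doc 1, simply because it is shorter and so has a smaller union with the query.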

 

3. Given three documents:

d1 : "Jack London traveled to Oakland"

d2 : "Jack London traveled to the city of Oakland"

d3 : "Jack traveled from Oakland to London"

What are the Jaccard coefficients J(d1,d2) and J(d1,d3) when computed with n-gram analysis using n = 2 (bigrams)?

Answer:

Jaccard(d1,d2)

|d1| = 4 (Jack London, London traveled, traveled to, to Oakland)

|d2| = 7 (Jack London, London traveled, traveled to, to the, the city, city of, of Oakland)

|d1 ∩ d2| = 3 (Jack London, London traveled, traveled to)

|d1 ∪ d2| = |d1| + |d2| - |d1 ∩ d2| = 4 + 7 - 3 = 8

So Jaccard(d1,d2) = |d1 ∩ d2| / |d1 ∪ d2| = 3/8 = 0.375

Jaccard(d1,d3)

|d1| = 4 (Jack London, London traveled, traveled to, to Oakland)

|d3| = 5 (Jack traveled, traveled from, from Oakland, Oakland to, to London)

|d1 ∩ d3| = 0

|d1 ∪ d3| = |d1| + |d3| - |d1 ∩ d3| = 4 + 5 - 0 = 9

So Jaccard(d1,d3) = |d1 ∩ d3| / |d1 ∪ d3| = 0/9 = 0
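
The bigram version can be sketched the same way in Python. The helper names bigrams and jaccard_bigrams are hypothetical, and the sketch assumes word-level bigrams built from a case-sensitive whitespace split, which is how the bigrams were listed by hand above.

```python
def bigrams(text: str) -> set:
    words = text.split()
    # Pair each word with its successor: (w0, w1), (w1, w2), ...
    return set(zip(words, words[1:]))


def jaccard_bigrams(a: str, b: str) -> float:
    """Jaccard coefficient between the word-bigram sets of two texts."""
    x, y = bigrams(a), bigrams(b)
    union = x | y
    # Assumed convention: two texts with no bigrams count as identical.
    return len(x & y) / len(union) if union else 1.0


d1 = "Jack London traveled to Oakland"
d2 = "Jack London traveled to the city of Oakland"
d3 = "Jack traveled from Oakland to London"

print(jaccard_bigrams(d1, d2))  # 0.375
print(jaccard_bigrams(d1, d3))  # 0.0
```

Although d1 and d3 contain almost the same words, they share no bigram, because bigrams are sensitive to word order.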
