Skip to content

Scripts and Code for leveraging Oracle APEX and Oracle Text together.

Notifications You must be signed in to change notification settings

chipbaber/apex_textdemo

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

83 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Oracle APEX Text Demonstration

This script will outline the steps required to leverage Oracle Text inside and Autonomous Database. Oracle APEX will be reference and leveraged for the creation of a front end application to upload and interact with the core code.

Video References

To save time it may be beneficial to watch the following video's related to this content.

Create a new workspace in Oracle APEX

  • In this example a workspace with the name searchdemo was created. As a part of this creation a database user named search demo was also created.

  • Login to your new APEX workshop and create a new user. In this example the workspace user is named cbaber.

  • Login to your workspace as your new user.

Add grants for Oracle Text to the schema.

  • Navigate to SQL Web Developer and open a session as admin.

  • Grant the following to searchdemo

grant ctxapp to searchdemo;
  • Query to check and see if the priv is granted. This must be executed from SQL workshop that is part of ATP as admin.
SELECT * FROM DBA_ROLE_PRIVS WHERE GRANTEE = 'searchdemo';

Create your table to store the resume.

  • this was done with APEX wizard code below
CREATE table "RESUME" (
    "DOC_ID"       NUMBER NOT NULL,
    "TITLE"        VARCHAR2(500),
    "SUBMITTED_BY" VARCHAR2(300),
    "RESUME"       BLOB,
    "MIMETYPE"     VARCHAR2(250),
    "CREATED_DATE" DATE,
    "FILENAME"     VARCHAR2(200),
    constraint  "RESUME_PK" primary key ("DOC_ID")
)
/

CREATE sequence "RESUME_SEQ"
/

CREATE trigger "BI_RESUME"  
  before insert on "RESUME"              
  for each row
begin  
  if :NEW."DOC_ID" is null then
    select "RESUME_SEQ".nextval into :NEW."DOC_ID" from sys.dual;
  end if;
end;
/   

Create a form to upload a document to the table.

Build index's on the table for Oracle Text.

  • Go to SQL Workshop, SQL commands set to pl/sql and execute the following to create a index. The CTXSYS.AUTO_FILTER must be used for structured docs like PDF and MSword.
CREATE INDEX searchMyDocs ON resume(resume) INDEXTYPE IS CTXSYS.CONTEXT PARAMETERS
('DATASTORE CTXSYS.DEFAULT_DATASTORE FILTER CTXSYS.AUTO_FILTER FORMAT COLUMN MIMETYPE');
  • Syntax to sync index every minute
CREATE INDEX searchMyDocs ON resume(resume) INDEXTYPE IS CTXSYS.CONTEXT PARAMETERS
('DATASTORE CTXSYS.DEFAULT_DATASTORE FILTER CTXSYS.AUTO_FILTER FORMAT COLUMN MIMETYPE sync (every "freq=secondly;interval=60")')

Basic queries of the indexed documentation.

  • Core query inside the document to see if contains a keyword. In this first example we are looking for all candidates who have the word java inside there resume. The greater than zero means a score is detected, the information is in the document.
SELECT SCORE(1), doc_id, title, submitted_by  FROM resume WHERE CONTAINS(resume, 'Java', 1) > 0;
  • Find all the documents that have the word java with a score >11 and contain the word nascar.
SELECT SCORE(1), doc_id, title, submitted_by  FROM resume WHERE CONTAINS(resume, '(java > 10) and nascar', 1) > 0;
  • Proximity search look for the word Eloqua near word code (Requires Chip resume)
SELECT SCORE(1), doc_id, title, submitted_by  FROM resume WHERE CONTAINS(resume, 'Eloqua ; code', 1) > 0;
  • Second example of proximity search with near operator. Looking for development near creativity within 5 words
SELECT SCORE(1), doc_id, title, submitted_by  FROM resume
WHERE CONTAINS(resume, 'near((development, creativity), 5)', 1) > 0;
  • Fuzzy Search on term.
SELECT SCORE(1), doc_id, title, submitted_by  FROM resume WHERE CONTAINS(resume, '?jav', 1) > 0;
  • Sounds like word - Soundex search
SELECT SCORE(1), doc_id, title, submitted_by  FROM resume WHERE CONTAINS(resume, '!ava', 1) > 0;
  • Stem search on a term. In this example we are looking for the documents that stem of the word work example would return words like workflow,
SELECT SCORE(1), doc_id, title, submitted_by  FROM resume WHERE CONTAINS(resume, '$work', 1) > 0;
  • Accumulation Search - Best to read this to understand Accum docs
SELECT SCORE(1), doc_id, title, submitted_by  FROM resume WHERE CONTAINS(resume, 'java ACCUM present', 1) > 0;
SELECT SCORE(1), doc_id, title, submitted_by  FROM resume WHERE CONTAINS(resume, 'java ACCUM present*3', 1) > 0;
  • Add More Docs in the apex app. To sync the index execute in this case with 5M of memory
begin
CTX_DDL.SYNC_INDEX('searchMyDocs', '5M');
end;

Construct Thematic Search Constructs

  • Build themes on docs. To do this you first need to create a table to hold your themes.
create table themes (query_id number, theme varchar2(2000), weight number);
  • Create all the themes on the docs in the table. Run this in pl/sql mode in SQL Workshop. This procedure will loop through the resume table and create themes for all the docs currently in the table.
begin
	for x in (select doc_id from resume) loop
    ctx_doc.themes ('searchMyDocs', x.doc_id, 'themes', x.doc_id, full_themes => false);
end loop;
end;
  • Query the themes. Show all themes with a weight over 25 per resume.
select r.title, r.filename, t.theme, t.weight from resume r, themes t
where r.doc_id = t.query_id and weight > 25
order by doc_id asc;
  • If you need to add more documents you need to rebuild the themes. Make sure to truncate the table before rebuilding.
declare
pragma autonomous_transaction;
killThemes	varchar2(100) := 'truncate table themes';

begin
execute immediate killThemes;
	for x in (select doc_id from resume) loop
    ctx_doc.themes ('searchMyDocs', x.doc_id, 'themes', x.doc_id, full_themes => false);
end loop;
end;

Construct Gists Search Constructs

  • Create the gists table
create table gists (query_id  number, pov  varchar2(80), gist  CLOB);
  • Build out gists index.
begin
	for x in (select doc_id from resume) loop
    ctx_doc.gist('searchMyDocs', x.doc_id, 'gists',x.doc_id,'P', pov =>'GENERIC');
end loop;
end;
  • Query a gist for a document.
select * from gists;
  • Rebuild the gists index
declare
pragma autonomous_transaction;
killGists	varchar2(100) := 'truncate table gists';

begin
execute immediate killGists;
	for x in (select doc_id from resume) loop
    ctx_doc.gist('searchMyDocs', x.doc_id, 'gists',x.doc_id,'P', pov =>'GENERIC');
end loop;
end;

Create a filtered doc

  • Create a Table for markup.
create table filtered_docs(QUERY_ID  	number,    DOCUMENT  	clob);
  • Build out filtered docs index's.
begin
	for x in (select doc_id from resume) loop
    	ctx_doc.filter ('searchMyDocs', x.doc_id, 'filtered_docs', x.doc_id, plaintext => true);
end loop;
end;                      
  • Query the filtered doc. Showing the title and the filtered text.
select r.title, f.document as "Plain Text Resume" from resume r, filtered_docs f
where r.doc_id = f.query_id

Create Full Themes Indexing

  • Create the table for full themes. A full theme has both the theme and any relations to other themes.
create table full_themes( QUERY_ID	number, THEME		varchar2(2000),  WEIGHT		NUMBER);
  • Create indexs for full themes.
begin
	for x in (select doc_id from resume) loop
    	ctx_doc.themes ('searchMyDocs', x.doc_id, 'full_themes', x.doc_id, full_themes => true);
end loop;
end;  
  • Query the full themes. Truth is you need a bit of a tool to visualize thematic connections. So this one works but is tough to convey to users.
select r.title, r.filename, t.theme, t.weight from resume r, full_themes t
where r.doc_id = t.query_id
order by doc_id asc;

Scripted Rebuilds

  • If you need to rebuild themes, gists ... after loading more documents execute the code in resumeAdmin.sql with the procedures you require. Then call the procedures. Example below.
begin
CTX_DDL.SYNC_INDEX('searchMyDocs', '5M');
resumeAdmin.Batch_Create_Themes();
resumeAdmin.Batch_Create_Full_Themes();
resumeAdmin.Batch_Create_Gists();
resumeAdmin.Batch_Create_Filtered_Docs();
end;

About

Scripts and Code for leveraging Oracle APEX and Oracle Text together.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published