TabFS is a browser extension that mounts your browser tabs as a filesystem on your computer.
Out of the box, it supports Chrome and (to a lesser extent[1]) Firefox and Safari, on macOS and Linux.[2]
(update: You can now sponsor me to help support further development of TabFS :-)
Each of your open tabs is mapped to a folder.
I have 3 tabs open, and they map to 3 folders in TabFS
The files inside a tab's folder directly reflect (and can control) the state of that tab in your browser. (TODO: update as I add more)
Going through the files inside a tab's folder. For example, the url.txt, text.txt, and title.txt files tell me those live properties of this tab
This gives you a ton of power, because now you can apply all the existing tools on your computer that already know how to deal with files -- terminal commands, scripting languages, point-and-click explorers, etc -- and use them to control and communicate with your browser.
Now you don't need to code up a browser extension from scratch every time you want to do anything. You can write a script that talks to your browser in, like, a melange of Python and bash, and you can save it as a single ordinary file that you can run whenever, and it's no different from scripting any other part of your computer.
Examples of stuff you can do[3]
(assuming your current directory is the fs subdirectory of the git repo and you have the extension running)
List the titles of all the tabs you have open
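A minimal Python sketch of this, assuming the filesystem is mounted at `mnt` and that each tab folder lives under `tabs/by-id/` (the layout suggested by the `mnt/tabs/by-id/6377/title.txt` path used later in this README). The function name `tab_titles` is made up for illustration.

```python
from pathlib import Path


def tab_titles(mount_root="mnt"):
    """Return the title of every open tab, one string per tab."""
    titles = []
    # Each open tab is a folder; its title.txt holds the live tab title.
    for title_file in sorted(Path(mount_root).glob("tabs/by-id/*/title.txt")):
        titles.append(title_file.read_text(encoding="utf-8").strip())
    return titles
```

Calling `print("\n".join(tab_titles()))` from the fs subdirectory should print one title per line.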
Cull tabs like any other files
Selecting and deleting a bunch of tabs in my file manager
I'm using Dired in Emacs here, but you could use whatever tools you already feel comfortable managing your files with.
Close all Stack Overflow tabs
or (older / more explicit)
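A rough Python sketch of the more explicit route. It leans on two assumptions that this text does not spell out: that tab folders live under `mnt/tabs/by-id/`, and that each tab folder has a writable `control` file that accepts the command `remove`. The helper names are hypothetical.

```python
from pathlib import Path


def tabs_with_title(needle, mount_root="mnt"):
    """Return the tab folders whose title.txt contains `needle`."""
    matches = []
    for tab_dir in sorted(Path(mount_root).glob("tabs/by-id/*")):
        title_file = tab_dir / "title.txt"
        if title_file.is_file() and needle in title_file.read_text(encoding="utf-8"):
            matches.append(tab_dir)
    return matches


def close_tabs(needle, mount_root="mnt"):
    """Ask the browser to close every matching tab; return how many were asked."""
    matches = tabs_with_title(needle, mount_root)
    for tab_dir in matches:
        # ASSUMPTION: a per-tab `control` file exists and understands `remove`.
        (tab_dir / "control").write_text("remove\n", encoding="utf-8")
    return len(matches)
```

With those assumptions, `close_tabs("Stack Overflow")` would do the whole job in one call.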
btw
(this task, removing all tabs whose titles contain some string, is a little contrived, but it's not that unrealistic, right?)
(now.. how would you do this without TabFS? I honestly have no idea, off the top of my head. like, how do you even get the titles of tabs? how do you tell the browser to close them?)
(I looked up the APIs, and, OK, if you're already in a browser extension, in a 'background script' inside the extension, and your extension has the tabs permission -- this already requires you to make 2 separate files and hop between your browser and your text editor to set it all up! -- you can do this:
chrome.tabs.query({}, tabs => chrome.tabs.remove(tabs.filter(tab => tab.title.includes('Stack Overflow')).map(tab => tab.id)))
)
(not terrible, but look at all that upfront overhead to get it set up. and it's not all that discoverable. and what if you want to reuse this later, or plug it into some larger pipeline of tools on your computer, or give it a visual interface? the jump in complexity once you need to communicate with anything -- possibly setting up a WebSocket, setting up handlers and a state machine -- is pretty horrifying)
(but to be honest, I wouldn't even have conceived of this as a thing I could do in the first place)
Save text of all tabs to a file
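One way this could look in Python, assuming the mount is at `mnt` and tab folders live under `tabs/by-id/`. The function name and the choice to separate tabs with a newline are illustrative, not part of TabFS.

```python
from pathlib import Path


def save_all_tab_text(output_path, mount_root="mnt"):
    """Concatenate every tab's text.txt into one file; return the tab count."""
    text_files = sorted(Path(mount_root).glob("tabs/by-id/*/text.txt"))
    with open(output_path, "w", encoding="utf-8") as out:
        for text_file in text_files:
            # errors="replace" keeps one oddly encoded page from aborting the run.
            out.write(text_file.read_text(encoding="utf-8", errors="replace"))
            out.write("\n")
    return len(text_files)
```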
Evaluate JavaScript on a page / watch expressions: demo
(was evals in linked demo, is now renamed to watches)
Now you can cat window.scrollY and see where you are scrolled on the page at any time.
Could make an ad-hoc dashboard around a Web page: a bunch of terminal windows floating around your screen, each sitting in a loop and using cat to monitor a different variable.
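The "sit in a loop and cat" idea can be sketched in Python as a small polling generator. It takes the path of the watch file directly, since the exact location of watch files inside a tab folder is not spelled out here; `poll_file` is a made-up name.

```python
import time
from pathlib import Path


def poll_file(path, samples=5, interval=1.0):
    """Read `path` `samples` times, `interval` seconds apart, yielding each value."""
    for i in range(samples):
        # Re-reading the file each time is what makes the value "live".
        yield Path(path).read_text(encoding="utf-8").strip()
        if i < samples - 1:
            time.sleep(interval)
```

For a dashboard pane you would loop over `poll_file(some_watch_file, samples=1000)` and print each value as it arrives.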
Get images / scripts / other resource files from page
(TODO: document better, put in screenshots)
The debugger/ subdirectory in each tab folder has synthetic files that let you access loaded resources (in debugger/resources/) and scripts (in debugger/scripts/).
Images will show up as actual PNG or JPEG files, scripts as actual JS files, and so on. (this is experimental.)
(TODO: edit the images in place? you can already kinda edit the scripts in place)
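A small Python sketch for peeking at what a tab has loaded, given the path to one tab folder. It only relies on the `debugger/resources/` and `debugger/scripts/` folders described above; the function name is hypothetical, and since this part of TabFS is experimental the folders may be empty or missing.

```python
from pathlib import Path


def list_debugger_files(tab_dir):
    """Return {'resources': [...], 'scripts': [...]} with file names for one tab."""
    listing = {}
    for kind in ("resources", "scripts"):
        folder = Path(tab_dir) / "debugger" / kind
        # A missing folder is treated as "nothing loaded" rather than an error.
        listing[kind] = sorted(p.name for p in folder.iterdir()) if folder.is_dir() else []
    return listing
```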
Retrieve what's playing on YouTube Music: youtube-music-tabfs
thanks to Junho Yeo!
Reload an extension when you edit its source code
Suppose you're working on a Chrome extension (apart from this one). It's a pain to reload the extension (and possibly affected Web pages) every time you change its code. There's a Stack Overflow post with ways to automate this, but they're all sort of hacky. You need yet another extension, or you need to tack weird permissions onto your work-in-progress extension, and you don't just get a command you can trigger from your editor or shell to refresh the extension.
TabFS lets you do all this in an ordinary shell script. You don't have to write any browser-side code at all.
This script turns an extension (this one's title is 'Playgroundize DevTools Protocol') off, then turns it back on, then reloads any tabs that have the relevant pages open (in this case, I decided it's tabs whose titles start with 'Chrome Dev'):
I mapped this script to Ctrl-. in my text editor, and now I just hit that every time I want to reload my extension code.
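A rough Python sketch of the same three steps (off, on, reload the affected tabs). Everything about the file layout here is an assumption rather than something this text documents: that extensions show up as folders under `mnt/extensions/` whose names start with the extension title, that each has a writable `enabled` file, and that each tab folder has a `control` file that accepts `reload`.

```python
from pathlib import Path


def reload_extension(extension_title, tab_title_prefix, mount_root="mnt"):
    """Toggle an extension off and on, then reload tabs by title prefix.

    Returns the number of tabs that were asked to reload.
    """
    root = Path(mount_root)
    # ASSUMPTION: extensions/<title...>/enabled exists and takes true/false.
    for ext_dir in sorted(root.glob("extensions/*")):
        if ext_dir.name.startswith(extension_title):
            (ext_dir / "enabled").write_text("false\n", encoding="utf-8")
            (ext_dir / "enabled").write_text("true\n", encoding="utf-8")
    reloaded = 0
    for tab_dir in sorted(root.glob("tabs/by-id/*")):
        title_file = tab_dir / "title.txt"
        if not title_file.is_file():
            continue
        if title_file.read_text(encoding="utf-8").startswith(tab_title_prefix):
            # ASSUMPTION: a per-tab `control` file exists and understands `reload`.
            (tab_dir / "control").write_text("reload\n", encoding="utf-8")
            reloaded += 1
    return reloaded
```

Under those assumptions, `reload_extension("Playgroundize DevTools Protocol", "Chrome Dev")` is the call you would bind to an editor shortcut.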
TODO: Live edit a running Web page
edit page.html in the tab folder. I guess it could just stomp outerHTML at first, eventually could do something more sophisticated
then you can use your existing text editor! and you'll always know that if the file saved, then it's up to date in the browser. no flaky watcher that you're not sure if it's working
(it would be cool to have a persistent storage story here also. I like the idea of being able to put arbitrary files anywhere in the subtree, actually, because then you could use git and emacs autosave and stuff for free.. hmm)
TODO: Import data (JSON? XLS? JS?)
drag a JSON file foo.json into the imports subfolder of the tab and it shows up as the object imports.foo in JS. (modify imports.foo in JS and then read imports/foo.json and you read the changes back?)
import a plotting library or whatever the same way? dragging plotlib.js into imports/plotlib.js and then calling imports.plotlib() to invoke that JS file
the browser has a lot of potential power as an interactive programming environment, one where graphics come as naturally as console I/O do in most programming languages. i think something that holds it back that is underexplored is lack of ability to just.. drag files in and manage them with decent tools. many Web-based 'IDEs' have to reinvent file management, etc from scratch, and it's like a separate universe from the rest of your computer, and migrating between one and the other is a real pain (if you want to use some Python library to munge some data and then have a Web-based visualization of it, for instance, or if you want to version files inside it, or make snapshots so you feel comfortable trying stuff, etc).
(what would the persistent storage story here be? localStorage? it's interesting because I almost want each tab to be less of a commodity, less disposable, since now it's the site I'm dragging stuff to and it might have some persistent state attached. like, if I'm programming and editing stuff and saving inside a tab's folder, that tab suddenly really matters; I want it to survive as long as a normal file would, unlike most browser tabs today)
(the combination of these last 3 TODOs may be a very powerful, open, dynamic, flexible programming environment where you can bring whatever external tools you want to bear, everything is live in your browser, you never need to restart..)
Setup
disclaimer: this extension is an experiment. I think it's cool and useful and provocative, and I usually leave it on, but I make no promises about functionality or, especially, security. applications may freeze, your browser may freeze, there may be ways for Web pages to use the extension to escape and hurt your computer .. In some sense, the whole point of this extension is to create a gigantic new surface area of communication between stuff inside your browser and software on the rest of your computer.
(The installation process is pretty involved right now. I'd like to simplify it, but I also don't want a seamless installation process that does a bad job of managing people's expectations. And it's important to me that users feel comfortable looking at how TabFS works -- it's pretty much just two files! -- and that they can mess around with it; it shouldn't be a black box.)
Before doing anything, clone this repository:
First, install the browser extension.
Then, install the C filesystem.
1. Install the browser extension
in Chrome, Chromium, and related browsers (including Brave and Vivaldi)
Go to the Chrome extensions page. Enable Developer mode (top-right corner).
Load-unpacked the extension/ folder in this repo.
Make a note of the extension ID Chrome assigns. Mine is jimpolemfaeckpjijgapgkmolankohgj. We'll use this later.
in Safari (WIP)
See the Safari instructions. You should compile the C filesystem (as below) before trying to run the extension.
in Firefox
You'll need to install as a 'temporary extension', so it'll only last in your current FF session. (If you want to install permanently, see this issue.)
Go to about:debugging#/runtime/this-firefox.
Load Temporary Add-on...
Choose manifest.json in the extension subfolder of this repo.
2. Install the C filesystem
First, make sure you have FUSE and FUSE headers. On Linux, for example, sudo apt install libfuse-dev or equivalent. On macOS, get macFUSE. (on macOS, also see this bug -- TODO work out the best path to explain here)
Then compile the C filesystem:
(GNU Make is required, so use gmake on FreeBSD)
Now install the native messaging host into your browser, so the extension can launch and talk to the filesystem:
Chrome, Chromium, and related browsers
Substitute the extension ID you copied earlier for jimpolemfaeckpjijgapgkmolankohgj in the command below.
(For Chromium, say chromium instead of chrome. For Vivaldi, say vivaldi instead. For Brave, say chrome. You can look at the contents of install.sh for the latest on browser and OS support.)
Safari (WIP)
See the Safari instructions.
Firefox
3. Ready!
Go back to chrome://extensions or about:debugging#/runtime/this-firefox and reload the extension.
Now your browser tabs should be mounted in fs/mnt!
Open the background page inspector to see the filesystem operations stream in. (in Chrome, click 'background page' next to 'Inspect views' in the extension's entry in the Chrome extensions page; in Firefox, click 'Inspect')
This console is also incredibly helpful for debugging anything that goes wrong, which probably will happen. (If you get a generic I/O error at the shell when running a command on TabFS, that probably means that an exception happened which you can check here.)
(My OS and applications are pretty chatty. They do a lot of operations, even when I don't feel like I'm actually doing anything. My sense is that macOS is generally chattier than Linux.)
Design
fs/: Native FUSE filesystem, written in C
- tabfs.c: Talks to FUSE, implements fs operations, talks to extension. I rarely have to change this file; it essentially is just a stub that forwards everything to the browser extension.
extension/: Browser extension, written in JS
- background.js: The most interesting file. Defines all the synthetic files and what browser operations they invoke behind the scenes.[4]
My understanding is that when you, for example, cat mnt/tabs/by-id/6377/title.txt in the tab filesystem:
1. cat on your computer does a system call open() down into macOS or Linux,
2. macOS/Linux sees that this path is part of a FUSE filesystem, so it forwards the open() to the FUSE kernel module,
3. FUSE forwards it to the tabfs_open implementation in our userspace filesystem in fs/tabfs.c,
4. then tabfs_open rephrases the request as a JSON string and forwards it to our browser extension over stdout ('native messaging'),
5. our browser extension in extension/background.js gets the incoming message; it triggers the route for /tabs/by-id/*/title.txt, which calls the browser extension API browser.tabs.get to get the data about tab ID 6377, including its title,
6. so when cat does read() later, the title can get sent back in a JSON native message to tabfs.c and finally back to FUSE and the kernel and cat.
(very little actual work happened here, tbh. it's all just marshalling)
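To make the marshalling step a bit more concrete, here is a Python sketch of how one message is framed on the wire. Browser native messaging sends each message as a 32-bit length in native byte order followed by UTF-8 JSON. The request fields `op` and `path` are hypothetical; the field names tabfs.c really uses may differ.

```python
import json
import struct


def encode_native_message(message):
    """Frame a JSON-serializable object as one native-messaging message."""
    body = json.dumps(message).encode("utf-8")
    # 4-byte unsigned length prefix, native byte order, then the JSON body.
    return struct.pack("=I", len(body)) + body


def decode_native_message(data):
    """Parse one framed message from bytes; return (object, remaining bytes)."""
    (length,) = struct.unpack("=I", data[:4])
    body = data[4:4 + length]
    return json.loads(body.decode("utf-8")), data[4 + length:]


# Hypothetical request shape, for illustration only.
request = {"op": "open", "path": "/tabs/by-id/6377/title.txt"}
framed = encode_native_message(request)
```

The filesystem side writes bytes like `framed` to stdout, and the response from the extension comes back in the same framing.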
TODO: make diagrams?
License
GPLv3
Sponsors
Thanks to all the project sponsors. Special thanks to:
things that could/should be done
(maybe you can do these? lots of people are already pitching in on GitHub; I wish it was easier for me to keep up listing them all here!)
add more synthetic files!! (it's just JavaScript) view DOM nodes, snapshot current HTML of page, spelunk into living objects. see what your code is doing. make more files writable also
build more (GUI and CLI) tools on top, on both sides
more persistence stuff. as I said earlier, it would also be cool if you could put arbitrary files in the subtrees, so .git, Mac extended attrs, editor temp files, etc all work. make it able to behave like a 'real' filesystem. also as I said earlier, some weirdness in the fact that tabs are so disposable; they have a very different lifecycle from most parts of my real filesystem. how to nudge that?
why can't Preview open images? GUI programs often struggle with the filesystem for some reason. CLI more reliable
multithreading. the key constraint is that I pass -s to fuse_main in tabfs.c, which makes everything single-threaded. but I'm not clear on how much it would improve performance? maybe a lot, but not sure. maybe workload-dependent?
- the extension itself (and the stdin/stdout comm between the fs and the extension) would still be single-threaded, but you could interleave requests since most of that stuff is async. like the screenshot request that takes like half a second, you could do other stuff while waiting for the browser to get back to you on that (?)
- update: we are multithreaded now, thanks to huglovefan!
- another issue is that applications tend to hang if any individual request hangs anyway; they're not expecting the filesystem to be so slow (and to be fair to them, they really have no way to). some of these problems may be inevitable for any FUSE filesystem, even ones you'd assume are reasonably battle-tested and well-engineered like sshfs?
other performance stuff -- remembering when we're already attached to things, reference counting, minimizing browser roundtrips. not sure impact of these
TypeScript (how to do with the minimum amount of build system and package manager nonsense?) (now realizing that if I had gone with TypeScript, I would then have to ask people to install npm and webpack and the TS compiler and whatever just to get this running. really, really glad I didn't.) maybe we can just do dynamic type checking at the fs op call boundaries?
look into support for Firefox / Windows / Safari / etc. best FUSE equiv for Windows? can you bridge to the remote debugging APIs that all of them already have to get the augmented functionality? or just implement it all with JS monkey patching?
window management. tab management where you can move tabs. 'merge all windows'. history management
hmm
Processes as Files (1984), Julia Evans /proc comic lay out the original /proc filesystem. it's very cool! very elegant in how it reapplies the existing interface of files to the new domain of Unix processes. but how much do I care about Unix processes now? most programs that I care about running on my computer these days are Web pages, not Unix processes. so I want to take the approach of /proc -- 'expose the stuff you care about as a filesystem' -- and apply it to something modern: the inside of the browser. 'browser tabs as files'
there are two 'operating systems' on my computer, the browser and Unix, and Unix is by far the more accessible and programmable and cohesive as a computing environment (it has concepts that compose! shell, processes, files), even though it's arguably the less important to my daily life. how can the browser take on more of the properties of Unix?
it's way too hard to make a browser extension. even 'make an extension' is a bad framing; it suggests making an extension is a whole Thing, a whole Project. like, why can't I just take a minute to ask my browser a question or tell it to automate something? lightness
'files are a sort of approachable 'bridge' that everyone knows how to interact with' / files are like one of the first things you learn if you know any programming language / 'because of this fs thing any beginner coding thing can make use of it now'
a lot of existing uses of these browser control APIs are in an automation context: testing your code on a robotic browser as part of some pipeline. I'm much more interested in an interactive, end-user context. augmenting the way I use my everyday browser. that's why this is an extension. it doesn't require your browser to run in some weird remote debugging mode that you'd always forget to turn on. it just stays running
system call tracing (dtruss or strace) super useful when anything is going wrong. (need to disable SIP on macOS, though.) the combination of dtruss (application side) & console logging fs request/response (filesystem side) gives a huge amount of insight into basically any problem, end to end
- there is sort of this sequence that I learned to try with anything. first, either simple shell commands or pure C calls -- shell commands are more ergonomic, C calls have the clearest mental model of what syscalls they actually invoke. only then do you move to the text editor or the Mac Finder, which are a lot fancier and throw a lot more stuff at the filesystem at once (so more can go wrong)
for a lot of things in the extension API, the browser can notify you of updates but there's no apparent way to query the full current state. so we'd need to sit in a lot of these places from the beginning and accumulate the incoming events to know, like, the last time a tab was updated, or the list of scripts currently running on a tab
async/await was absolutely vital to making this readable
filesystem as 'open input space' where there are things you can say beyond what this particular filesystem cares about. (it reminds me of my Screenotate -- screenshots give you this open field where you can carry through stuff that the OCR doesn't necessarily recognize or care about. same for the real world in Dynamicland; you can scribble notes or whatever even if the computer doesn't see them)
now you have this whole 'language', this whole toolset, to control and automate your browser. there's this built-up existing capital where lots of people and lots of application software and lots of programming languages .. already know the operations to work with files
this project is cool bc i immediately get a dataset i care about. I found myself using it 'authentically' pretty quickly -- to clear out my tabs, to help me develop other things in the browser so I'd have actions I could trigger from my editor, ..
stuff that looks cool / is related:
SQLite virtual tables have some of the same energy as FUSE synthetic filesystems to me, except instead of 'file operations', 'SQL' is the well-known interface / knowledge base / ecosystem that they piggyback on. osquery seems particularly cool
Plan 9. I think a lot about extensibility in the Acme text editor, where instead of a 'plugin API', the editor just provides a synthetic filesystem
my fake filesystems talk
Witchcraft has the right idea for how to set up userscripts. just make files -- don't make your own weird UI to add and remove them. (I guess there is a political or audience tradeoff here, where some kinds of users might be comfortable with managing files, but you might alienate others. hmm)
rmdir a non-empty directory -- when I was thinking if you should be able to rm by-id/TABID even though TABID is a folder. I feel like a new OS, something like Plan 9, should generalize its file I/O APIs just enough to avoid problems like this. like design them with the disk in mind but also a few concrete cases of synthetic filesystems, very slow remote filesystems, etc
do you like setting up sockets? I don't
[1] because of the absence of the chrome.debugger API for extensions. With a bit more plumbing, you could maybe find a way to connect it to the remote debugging protocol in Firefox and other browsers and get that second level of functionality that is currently Chrome-only.
[2] plus some related browsers and platforms: it also supports Brave, Vivaldi, FreeBSD, etc. It could probably be made to work on Windows using Dokan or WinFUSE/WSL stuff (?), but I haven't looked into that yet.
[3] maybe some of these feel a little more vital and fleshed-out and urgent than others. the things I actually wanted to do and reached for vs. the things that satisfy some pedagogical property (simple to explain, stack on top of the previous example, ..)
[4] it frustrates me that I can't show you, like, a table of contents for this source file. because it does have a structure to it! so I feel like the UI for looking at this one file should be custom-tailored to highlight and exploit that structure. (I wonder what other cases like this are out there, where ad hoc UI for one file would be useful. like if you have tangled-but-regular business logic, or the giant opcode switch statement of an emulator or interpreter.)
I want to link you to a particular route and talk about it here and also have some kind of transclusion (without the horrifying mess of making a lot of tiny separate files). I want to use typesetting and whitespace to set each route in that file apart, and set them as a whole apart from the utility functions & default implementations & networking.