Banks are on an unstoppable uncontrolled trajectory in pursuit of KYC over-achievement. That is, they over-collect far more data on people than legally required (before it gets leaked to criminals in data breaches). Banks’ privacy policies are rife with anti-consumer weasel words.
It’s such a shit-show that privacy proponents have no real choice other than to quit banks and operate entirely with cash. Not many people have that level of discipline.
Software can turn this situation around. For example, there are ~6000 privacy-abusing banks and credit unions in the US. If a robot harvests all the privacy policies, fetches AOS apps to check permission reqs, records those with websites MitMd by Cloudflare, and uses all that info to find the lesser of evils, consumers can participate in creating a competition for privacy (as opposed to a competition of meaningless soul-selling fractions of a percent of interest earnings). The heart of the problem is banks are only getting pressure from the side of oppressors and tyranny and no pressure from the side of the people they purport to serve. Software and data can remedy this.
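A rough Python sketch of how those signals might be combined into a ranking. Everything in it is an assumption for illustration: the field names, the weights, and the `BankSignals` record itself are hypothetical. The Cloudflare check relies on the `Server: cloudflare` and `CF-RAY` response headers that Cloudflare-proxied sites normally return.

```python
from dataclasses import dataclass

@dataclass
class BankSignals:
    name: str
    shares_with_affiliates: bool  # read from the privacy disclosure
    app_permission_count: int     # e.g. taken from a permissions report
    response_headers: dict        # headers returned by the bank's website

def behind_cloudflare(headers: dict) -> bool:
    # Header names are case-insensitive, so compare in lower case.
    lowered = {k.lower(): str(v).lower() for k, v in headers.items()}
    return "cf-ray" in lowered or lowered.get("server", "") == "cloudflare"

def hostility_score(bank: BankSignals) -> int:
    # Hypothetical weights; higher means more privacy-hostile.
    score = 0
    if bank.shares_with_affiliates:
        score += 3
    score += bank.app_permission_count
    if behind_cloudflare(bank.response_headers):
        score += 5
    return score

def rank(banks):
    # Least hostile first, i.e. the lesser of evils at the top.
    return sorted(banks, key=hostility_score)
```

The weights are where the arguments would start, so they should probably be user-adjustable rather than baked in.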
Worth noting that long before the AI bubble started, a university in the US studied bank privacy policies in bulk using a scraper bot that only looked at the standardised privacy disclosure forms, which all banks must publish in a standard layout. The data has rotted by now, so their research is not of much use.
I would not trust an “AI” bot to do this kind of task, at least not directly. These bots are not good at boiling data down while retaining accuracy. It would still have to be a process that some LLMs could help with, but one definitely could not work as a one-stop shop. Maybe one could help find the policies, and perhaps give a separate digest of how loose or restrictive each is, but I would not trust it to accurately boil the data down to a table of facts itself or to even properly identify those ‘facts’.
An analytical, data-producing task such as this is not a great fit for modern “AI”. It does not reason. It does not compare or contrast. It does not correlate. It can spit out sentences to make you think it does those things, but it doesn’t. Implementing “AI” in the actual data-producing parts of a task like this is just asking for ‘hallucinations’ and all sorts of other problems to corrupt the resulting data to the point where you could no longer trust it.
Thanks for the insight. Certainly having a human look at the raw data of 6000 banks is a non-starter. I’ve not studied AI, so if I were going to take this project on I would have to (for example) look at what banks charge for paper statements (because offline banking options are a refuge from copious privacy abuses). I would want to short-list banks that offer gratis paper statements.
The phrase “free paper statements” can be worded in many different ways. I might expect an LLM to be good at that sort of thing. In my non-AI approach, I would have to look at a large sample to get an idea of all the different ways that something is expressed then try to write a regular expression to cover them. Is that still the best way?
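For concreteness, the regex approach might look like this Python sketch. The phrasings are guesses for illustration, not harvested from real fee schedules, and `mentions_free_paper_statements` is a hypothetical helper name.

```python
import re

# Ways of saying "free" (assumed, not exhaustive).
FREE = r"(?:free|gratis|no[ -](?:charge|fee|cost)|at no (?:charge|cost))"
# Ways of saying "paper statement" (assumed, not exhaustive).
PAPER = r"(?:paper|printed|mailed)\s+statements?"

# Match either order, allowing a few filler words in between.
PATTERN = re.compile(
    rf"{FREE}\W+(?:\w+\W+){{0,4}}?{PAPER}|{PAPER}\W+(?:\w+\W+){{0,6}}?{FREE}",
    re.IGNORECASE,
)

def mentions_free_paper_statements(text: str) -> bool:
    return PATTERN.search(text) is not None
```

A pattern like this is blind to negation ("paper statements are not free" still matches), so any hit would still need a look at the surrounding sentence.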
Phrases like “we value your privacy” and “we only use your data as legally permitted” (which tries to deceive readers into thinking of data minimisation when it really means the opposite) – these can also be worded in many ways, all of which could elevate a /bullshit/ score, of sorts.
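A minimal sketch of such a score in Python, assuming it is simply a count of weasel-phrase hits. The pattern list is made up for illustration and would need to grow from a real sample of policies.

```python
import re

# Hypothetical weasel-phrase patterns; each hit adds one point.
WEASEL_PATTERNS = [
    r"we (?:value|respect|care about) your privacy",
    r"we take your privacy seriously",
    r"(?:as|where|to the extent) (?:legally )?(?:permitted|allowed)(?: by (?:applicable )?law)?",
    r"trusted (?:partners|third parties)",
]

def bullshit_score(policy_text: str) -> int:
    # Total number of matches across all patterns, ignoring case.
    return sum(
        len(re.findall(pattern, policy_text, re.IGNORECASE))
        for pattern in WEASEL_PATTERNS
    )
```

Longer policies would naturally score higher, so dividing by word count might be fairer.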
One tool I find quite useful for language translation comes from these two sites:
It’s not just a translation of a blob of text, but you enter a short phrase in one language and it finds real instances of the same phrase in the other language, so you can see how one idea can be expressed in many ways within a language. I assumed an LLM was in play but I don’t really know.
Of course what we need is not translation from one language to another but a tool that detects different ways within 1 language to express the same idea; almost like synonyms but for phrases.
LLMs could do that for you, and it could be mostly accurate. The way to use AI and get reliable results is to have it do something that’s verifiable, and then verify it if you want actually reliable data. For example, you could definitely have AI try to find the links to docs, features and so on as listed on the banks’ sites. Just make sure to keep the links it finds around and at least verify they’re what it says they are. You can use an AI to do that step, but LLMs do not actually reason and would be fully capable of false negatives and false positives on any determination, no matter how small.
IMO, as long as you made the data fully auditable by hand as easily as possible (and that’d be a great idea for anything referencing externalities), it wouldn’t be bad to use “AI” to construct it. It’d even be a good idea if you want to just trust the AI’s output, as an auditable data set could be updated by AI, too, when policies change and whatnot.
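One way to make that auditable, as a Python sketch: every fact the model extracts is stored with its source URL and the exact quote it was based on, and a plain script checks that the quote really occurs in the fetched page. The `Claim` record and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Claim:
    bank: str
    fact: str        # e.g. "paper statements are free"
    source_url: str  # where the text was fetched from
    quote: str       # verbatim excerpt the model says supports the fact

def normalise(text: str) -> str:
    # Collapse whitespace and case so layout differences do not matter.
    return " ".join(text.split()).lower()

def quote_is_present(claim: Claim, page_text: str) -> bool:
    quote = normalise(claim.quote)
    # An empty quote proves nothing, so treat it as unverified.
    return bool(quote) and quote in normalise(page_text)

def split_for_audit(claims, pages):
    """pages maps source_url -> fetched page text."""
    verified, needs_review = [], []
    for claim in claims:
        text = pages.get(claim.source_url, "")
        if quote_is_present(claim, text):
            verified.append(claim)
        else:
            needs_review.append(claim)
    return verified, needs_review
```

This only proves the quote exists, not that it supports the fact, but it shrinks what a human has to read to one sentence per claim.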
I must be too drunk because I don’t understand a thing about what you’re talking about.
AI bot
AI? Why AI? Why not use a script written in any language like Python?
App permissions
Use a decent custom ROM to control the permissions. That’s it, you can do it now.
TOR-hostility
I hate banks as much as anyone, but they have legal responsibilities and TOR is the worst pain in the ass for them, and easiest thing to block on their end. You can complain about it but it will never change. Also they already have your identity and financial data, there is no need to hide your identity or location from them. Also you can use a VPN most of the time. Why TOR?
they over-collect far more data
Use uBlock Origin, is it not enough?
If a robot harvests
If.
consumers can participate
Consumers don’t care.
TL;DR: use a VPN and uBlock Origin.
AI? Why AI? Why not use a script written in any language like Python?
You tell me. Does python have a library that equates synonymous phrases and patterns in natural languages without AI?
Python is what the researchers used, but they only strictly looked at the standard privacy tables with cells that contain yes, no, or sometimes/depends.
Use a decent custom ROM to control the permissions. That’s it, you can do it now.
You’ve misunderstood. I would never use a bank’s proprietary closed-source spyware exclusively distributed by Google for Google patrons. I envisioned harvesting Exodus Privacy reports to use as input in the assessment of each bank’s privacy hostility and abusiveness.
At the same time, the app proposed would obviously not just be for me. Countless boot-licking people are happy to use bank apps. But some may want to know which bank exploits them relatively less. It is not sensible to think all such people have the competency to handle custom ROMs. Also, bank apps tend to detect when they are running on a VM or custom ROM and terminate.
I hate banks as much as anyone, but they have legal responsibilities and TOR is the worst pain in the ass for them, and easiest thing to block on their end.
Not my problem. The banks staffed by those with a higher level of competency to handle Tor users should get praise and recognition. What are you claiming is the benefit to the app ignoring the Tor accessibility factor?
You can complain about it but it will never change.
Exactly why the app should look for it. Complaints are useless. Vote with your feet and switch to a bank that has their shit together. Money talks.
Also they already have your identity and financial data, there is no need to hide your identity or location from them.
Nonsense. They potentially¹ have customers’ residential address, not their realtime geolocation at every login. And your ISP does not know where you bank unless you are foolish enough to access your bank from home over clearnet. Note that Trump has ensured that ISPs do not need your consent to collect and sell that data. So today under Trump an ISP can sell where you bank to prospective debt collectors, for example.
¹ E.g. in the US, they need an address, not necessarily a residential address. A homeless person can legally open a bank account using a family member’s address, for example.
Also you can use a VPN most of the time. Why TOR?
Cloudflare treats VPNs with the same hostility as Tor. VPNs also require trust that Tor does not.
Use uBlock Origin, is it not enough?
When your bank asks you for income info face to face, uBlock Origin cannot protect you.
Consumers don’t care.
The app would not be for all consumers. The app can only serve the small minority of consumers who care about privacy. I’m fine with that.
